Expand ARM Architecture Compatibility #5954

halibobo1205 · 2024-08-15T03:31:39Z

Background

Java-Tron currently supports only the x86 architecture. However, the ARM architecture has gained significant traction recently, especially in cloud computing and mobile devices. ARM processors are known for their energy efficiency and cost-effectiveness, which makes them increasingly popular in data centers, cloud computing, and edge computing scenarios. It would be great to have an option to run Java-Tron on the ARM architecture.

Key developments in ARM architecture:

Apple's transition from Intel to its ARM-based processors for Macs, dubbed Apple Silicon
AWS Graviton, Microsoft Azure Ampere Altra, Alibaba Cloud Yitian 710 and Google Axion processors used in cloud computing
Increasing adoption of ARM in supercomputers and high-performance computing

ARM advantages:

Better performance per watt, low energy consumption, and low cost
- 23% Cost savings and 36% performance gain by deploying GitLab on Arm-based AWS Graviton2
Lower total cost of ownership for data centers
Growing ecosystem and software support

Related Issues and PRs

Scope of Impact

Build and deployment processes
Core application code
Third-party dependencies
Development and testing environments

Current Progress Summary

JDK version
- Upgrade to JDK 17 for ARM Architecture #5976
Native code
- RocksDBJNI: org.rocksdb:rocksdbjni:7.7.3
- LevelDBJNI: com.halibobor:leveldbjni-all:1.18.3, used only for CI tests; the production environment supports only RocksDB
- zksnark-java-sdk: io.github.tronprotocol:zksnark-java-sdk:1.0.0
Third-party dependencies
- protoc-gen-grpc-java: GreatVoyage-v4.7.3 released Oct 25, 2023, upgraded io.grpc:protoc-gen-grpc-java from 1.9.0 to 1.52.1 for ARM64 architecture by feat(net):update com.google.protobuf and io.grpc version #5254
Floating-point arithmetic
- Migrate pow operation from java.lang.Math to java.lang.StrictMath in GreatVoyage-v4.7.7(Epicurus)
- Replace java.lang.Math with the cross-platform consistent java.lang.StrictMath in GreatVoyage-v4.8.0(Kant)
- For historical data (the Bancor trading pair), hardcode the special cases (48 pow calculation instances)
Build and deployment process
- Update build scripts to support ARM architecture.
- Ensure CI/CD pipelines can be built and tested in ARM environments.
- Docker support

angrynurd · 2024-08-15T03:41:24Z

I am totally in favor of extending ARM architecture compatibility. This will allow Java-Tron to run on more platforms and take advantage of the benefits of the ARM architecture, such as higher energy efficiency and lower cost.
In my opinion, we can start with the following:

Prioritize ARM support for key dependencies: For example, RocksDB/LevelDB is an important database component in Java-Tron and it is critical to ensure its compatibility on ARM.
Establish an ARM test environment: We need to establish a dedicated ARM test environment to ensure the stability and performance of Java-Tron on ARM.
Collaborate with the community: We can work with the community to solve ARM compatibility issues and share experiences and best practices.

tomatoishealthy · 2024-08-15T03:51:51Z

It sounds great, but I am a novice in ARM architecture. I am curious about the challenges of supporting ARM architecture.

Can you list something like a task list in the future? It is convenient to clearly understand the current status and future challenges.

halibobo1205 · 2024-08-15T04:16:21Z

Here are some common considerations:

Important

1. JDK version compatibility
Ensure the JDK version supports ARM Architecture. Consider using ARM-optimized JDK distributions.

Linux got support in JDK 9(non-LTS) by JEP 237
Windows got support in JDK 16(non-LTS) by JEP 388
Macs got support in JDK 17(LTS) by JEP 391
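
As a rough illustration of the JDK and architecture check discussed above, a node could inspect the standard `os.arch` and `java.version` system properties at startup. This is only a sketch: `ArchCheck` and its methods are hypothetical names, not java-tron code, and accepting the `arm64` spelling in addition to `aarch64` is a defensive assumption.

```java
import java.util.Locale;

// Hypothetical helper; names are illustrative and not part of java-tron.
final class ArchCheck {

  /** Returns true when the given os.arch value denotes 64-bit ARM. */
  static boolean isArm64(String osArch) {
    String arch = osArch.toLowerCase(Locale.ROOT);
    // "aarch64" is the usual value; "arm64" is accepted as a defensive assumption.
    return arch.equals("aarch64") || arch.equals("arm64");
  }

  /** Parses the feature version from strings such as "1.8.0_202" or "17.0.9". */
  static int featureVersion(String javaVersion) {
    String[] parts = javaVersion.split("[._+-]");
    int first = Integer.parseInt(parts[0]);
    // Versions before JDK 9 use the "1.x" scheme, so the second field is the feature version.
    return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
  }

  public static void main(String[] args) {
    String arch = System.getProperty("os.arch");
    String version = System.getProperty("java.version");
    System.out.println("os.arch=" + arch + ", java.version=" + version
        + ", java.vendor=" + System.getProperty("java.vendor"));
    if (isArm64(arch) && featureVersion(version) < 17) {
      System.out.println("ARM64 detected with a JDK older than 17");
    }
  }
}
```

Parsing the version string by hand keeps the sketch usable on JDK 8, where `Runtime.version()` is not available.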

Important

2. Native code
JNI (Java Native Interface) or other native code.
These native code components need to be recompiled or upgraded for ARM architecture.

LevelDBJni
RocksDBJni
zksnark-java-sdk

Tip

3. Endianness
x86 is little-endian, while ARM is bi-endian: it can run in either byte order, although mainstream 64-bit ARM platforms (Linux, macOS, Windows) run little-endian.
Check if any operations in the code(such as TVM) depend on a specific endianness, especially when handling binary data.
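
One way to keep byte-level code independent of the host CPU is to name the byte order explicitly instead of relying on the native one. The sketch below is illustrative only; `EndianDemo` and its methods are hypothetical names, not java-tron code. Note that a Java `ByteBuffer` starts in big-endian order on every platform, so only code that asks for `ByteOrder.nativeOrder()` or goes through native libraries is exposed to the host byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; names are illustrative and not part of java-tron.
final class EndianDemo {

  /** Encodes a long as 8 bytes in big-endian order, regardless of the host CPU. */
  static byte[] toBigEndian(long value) {
    return ByteBuffer.allocate(Long.BYTES).order(ByteOrder.BIG_ENDIAN).putLong(value).array();
  }

  /** Decodes 8 big-endian bytes back into a long. */
  static long fromBigEndian(byte[] bytes) {
    return ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getLong();
  }

  public static void main(String[] args) {
    // The native order depends on the platform; the encoded bytes do not.
    System.out.println("native order: " + ByteOrder.nativeOrder());
    byte[] encoded = toBigEndian(0x0102030405060708L);
    System.out.println("first byte: " + encoded[0] + ", last byte: " + encoded[7]);
  }
}
```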

Tip

4. Memory alignment:
ARM architecture may have different memory alignment requirements than x86.
Check for code(such as TVM) that assumes specific memory alignments.

Tip

5. Atomic operations and concurrency
Some atomic operations(TVM) may be implemented differently on different architectures.
Review concurrent code to ensure it works correctly on ARM as well.
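
ARM has a weaker hardware memory ordering than x86, so code that only happens to work on x86 without proper synchronization can fail on ARM. Code that relies on the Java Memory Model primitives (`volatile`, `java.util.concurrent.atomic`, locks) behaves the same on both. A minimal sketch of such a portable pattern, with the hypothetical name `CounterDemo`:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example; not java-tron code.
final class CounterDemo {

  /** Increments a shared AtomicLong from several threads and returns the final value. */
  static long countWithThreads(int threads, int incrementsPerThread) {
    AtomicLong counter = new AtomicLong();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < incrementsPerThread; j++) {
          counter.incrementAndGet(); // atomic read-modify-write on every architecture
        }
      });
      workers[i].start();
    }
    try {
      for (Thread worker : workers) {
        worker.join(); // join() makes the worker's writes visible to this thread
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    return counter.get();
  }

  public static void main(String[] args) {
    System.out.println(countWithThreads(4, 10_000));
  }
}
```

Replacing the `AtomicLong` with a plain `long` field would make the result unpredictable on any architecture; weaker memory ordering only makes such bugs easier to hit.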

Caution

6. Floating-point arithmetic
ARM and x86 may have subtle differences in floating-point precision and behavior.
For applications that rely on precise floating-point calculations, comprehensive testing is necessary.

Tip

7. Performance optimization
x86-specific performance optimizations(TVM) may no longer be applicable on ARM.
Consider using ARM-specific optimization techniques.

Important

8. Third-party dependencies
Ensure all third-party libraries and dependencies support ARM architecture.
Some incompatible dependencies may need to be updated or replaced.

protoc-gen-grpc-java

Important

9. Build and deployment process:

Update build scripts to support ARM architecture.
Ensure CI/CD pipelines can be built and tested in ARM environments.
Docker support

Tip

10. Hardware feature dependencies:
Check if the code(TVM) relies on x86-specific hardware features.
Alternatives may need to be found for ARM.

Tip

11. System calls and OS interactions
If the code makes direct system calls, adjustments may be needed for ARM.

Important

12. Cross-platform testing

Establish comprehensive test suites to ensure the functionality works correctly on ARM.
Conduct performance benchmarking to compare x86 and ARM performance differences.

317787106 · 2024-08-15T05:38:29Z

@halibobo1205 Do you want to support the ARM architecture and the latest JVM version at the same time, or just support the ARM architecture on JDK 8?

halibobo1205 · 2024-08-15T07:42:16Z

@317787106 JVM officially supports ARM:

Linux got support in JDK 9 by JEP 237
Windows got support in JDK 16 by JEP 388
Macs got support in JDK 17 by JEP 391

According to the Oracle Java SE Support Roadmap, JDK 9 and JDK 16 are non-LTS releases, and JDK 17 is an LTS release. Based on this, I propose JDK 17 as the minimum version for ARM support.

Warning

This is the last planned update of JDK 17 under the NFTC. Updates after September 2024 will be licensed under the Java SE OTN License (OTN) and production use beyond the limited free grants of the OTN license will require a fee.

angrynurd · 2024-08-15T07:58:14Z

Here are some common considerations:

Regarding JDK version compatibility.
You recommend using an ARM-optimized JDK distribution. What specific ARM-optimized JDK distributions do you recommend? What are their performance and stability advantages?

abn2357 · 2024-08-15T10:03:28Z

When is the expected completion time for this work? It sounds like a big project.

halibobo1205 · 2024-08-15T10:40:35Z

@endiaoekoe I propose that ARM support JDK17 as a minimum.

halibobo1205 · 2024-08-15T10:51:21Z

@abn2357 Tron currently supports only JDK 8. Based on the information above, JDK 17 fully supports ARM, so Tron may need to upgrade to JDK 17 first, which is another big project.

zeusoo001 · 2024-08-20T06:21:27Z

@halibobo1205 It sounds great, and I look forward to your implementation. I see that there may be subtle differences in floating point precision and behavior between ARM and x86. When supporting it, be sure to ensure data consistency. Also investigate whether there are other places that may cause data inconsistency.

Murphytron · 2024-08-20T06:45:09Z

This issue has been added to the core devs community call #22, welcome to share the latest progress @halibobo1205, and discuss together with @endiaoekoe @tomatoishealthy @317787106 @zeusoo001 @abn2357.

halibobo1205 · 2024-08-20T09:19:41Z

1. JDK version compatibility
After some brief research, I found ARM 64-bit versions of JDK 8 available. cc @endiaoekoe @317787106 @abn2357

| Provider | Linux | Mac | Windows | Notes |
| --- | --- | --- | --- | --- |
| Oracle | ✓ | ✓ | ✗ | Official support; requires payment for commercial use |
| Eclipse Temurin | ✓ | ✗ | ✗ | Free OpenJDK; regularly updated and supported by the Adoptium community |
| Azul Zulu | ✓ | ✓ | ✗ | Free OpenJDK; full enterprise version requires payment |
| BellSoft Liberica | ✓ | ✓ | ✗ | Free OpenJDK for all users; relatively less well-known |
| Amazon Corretto | ✓ | ✓ | ✗ | Free OpenJDK; long-term support by Amazon; Amazon runs Corretto internally on thousands of production services |

halibobo1205 · 2024-08-20T10:33:44Z

6. Floating-point arithmetic known issues:

x87 FPU for some Java Math
- Don't use x87 FPU on x86-64
- Misuse of Math library without strictfp

Unfortunately, Tron does use Math.pow() for floating-point calculations for the Bancor trading pair in ExchangeProcessor:

  private long exchangeToSupply(long balance, long quant) {
    logger.debug("balance: " + balance);
    long newBalance = balance + quant;
    logger.debug("balance + quant: " + newBalance);

    double issuedSupply = -supply * (1.0 - Math.pow(1.0 + (double) quant / newBalance, 0.0005));
    logger.debug("issuedSupply: " + issuedSupply);
    long out = (long) issuedSupply;
    supply += out;

    return out;
  }

  private long exchangeFromSupply(long balance, long supplyQuant) {
    supply -= supplyQuant;

    double exchangeBalance =
        balance * (Math.pow(1.0 + (double) supplyQuant / supply, 2000.0) - 1.0);
    logger.debug("exchangeBalance: " + exchangeBalance);

    return (long) exchangeBalance;
  }
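
Read as formulas, the two methods above compute the following, where $S$ stands for `supply`, $B$ for `balance`, $q$ for `quant`, and $s$ for `supplyQuant`. The symbols and the $\operatorname{trunc}$ notation for the cast to `long` are introduced here only for illustration.

$$\Delta S = \operatorname{trunc}\left(S\left[\left(1+\frac{q}{B+q}\right)^{0.0005}-1\right]\right), \qquad S \leftarrow S + \Delta S$$

$$S \leftarrow S - s, \qquad \Delta B = \operatorname{trunc}\left(B\left[\left(1+\frac{s}{S}\right)^{2000}-1\right]\right)$$

The two exponents are reciprocals ($0.0005 = 1/2000$). Both results pass through `Math.pow` before truncation, so a difference in the last bit of the pow result can change the integer output.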

Test case

 @Test
  public void testPow() {
    double x = 29218;
    double q = 4761432;
    double ret = Math.pow(1.0 + x / q, 0.0005);
    double ret2 = StrictMath.pow(1.0 + x / q, 0.0005);

    System.out.printf("%s%n", doubleToHex(ret)); //  3ff000033518c576
    System.out.printf("%s%n", doubleToHex(ret2)); // 3ff000033518c575
    Assert.assertEquals(0, Double.compare(ret, ret2)); // fail in jdk8_X86, success in jdk8_ARM64
  }

  public static String doubleToHex(double input) {
    // Convert the starting value to the equivalent value in a long
    long doubleAsLong = Double.doubleToRawLongBits(input);
    // and then convert the long to a hex string
    return Long.toHexString(doubleAsLong);
  }
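
To put the two hex values from the test case in perspective, the sketch below decodes them and measures how far apart they are. `UlpDemo` and its methods are hypothetical names used only for illustration. The two bit patterns differ in the last bit only, that is, by one unit in the last place (ULP).

```java
// Hypothetical helper; names are illustrative and not part of java-tron.
final class UlpDemo {

  /** Decodes a 16-digit hex string (an IEEE 754 bit pattern) into a double. */
  static double fromHex(String hex) {
    return Double.longBitsToDouble(Long.parseUnsignedLong(hex, 16));
  }

  /** Distance in ULPs between two finite doubles of the same sign. */
  static long ulpDistance(double a, double b) {
    return Math.abs(Double.doubleToLongBits(a) - Double.doubleToLongBits(b));
  }

  public static void main(String[] args) {
    double viaMath = fromHex("3ff000033518c576");       // Math.pow result reported for x86 JDK 8
    double viaStrictMath = fromHex("3ff000033518c575"); // StrictMath.pow result
    System.out.println("ULP distance: " + ulpDistance(viaMath, viaStrictMath));
    System.out.println("difference:   " + (viaMath - viaStrictMath));
  }
}
```

A one-ULP difference is tiny numerically, but consensus code needs bit-identical results on every node, so even this is enough to cause divergence.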

Tron should use StrictMath to avoid cross-platform consistency issues. To help ensure that Java-Tron is portable to ARM, I suggest a new proposal to convert Math to StrictMath. cc @zeusoo001

317787106 · 2024-08-20T10:48:10Z

@halibobo1205 It may be smoother to first support JDK 8 on Mac ARM and then extend support to JDK 17 on Linux and Mac ARM.

tomatoishealthy · 2024-08-21T02:55:36Z

JDK version compatibility After some brief research, I found ARM 64-bit versions of JDK 8 available. cc @endiaoekoe @317787106 @abn2357

Does this mean that there is no longer a dependency between ARM architecture upgrade and JDK upgrade?

In addition, TRON only focuses on Oracle JDK, right?

halibobo1205 · 2024-08-21T05:31:46Z

@tomatoishealthy, Oops! I'm sorry. I missed that java-tron currently only supports the Oracle JDK.

I found some issues that seem to indicate that OpenJDK lacks the JavaFX module, and that Tron used javafx.util.Pair. Odyssey-v3.7, released Mar 17, 2020, fixed this in #2510. Before that release, Tron seems to have worked with OpenJDK + openjfx; since then, Tron seems to be able to support OpenJDK. But Tron still claims to support only Oracle JDK 8. Is there any other reason? cc @317787106 @zeusoo001

halibobo1205 · 2024-08-21T08:42:37Z

halibobo1205 · 2024-08-21T09:07:13Z

8. Third-party dependencies

Ensure all third-party libraries and dependencies support ARM architecture.
Some incompatible dependencies may need to be updated or replaced, including, but not limited to, the following.

protoc-gen-grpc-java: see Update protoc-gen-grpc-java dependency to > v1.26.0 to be able to build in ARM64 #4248
- GreatVoyage-v4.7.3 released Oct 25, 2023, upgraded io.grpc:protoc-gen-grpc-java from 1.9.0 to 1.52.1 for ARM64 architecture by feat(net):update com.google.protobuf and io.grpc version #5254

halibobo1205 · 2024-08-27T04:21:31Z

Warning

This is the last planned update of JDK 17 under the NFTC. Updates after September 2024 will be licensed under the Java SE OTN License (OTN) and production use beyond the limited free grants of the OTN license will require a fee.

To avoid subsequent charges for commercial use, I recommend switching to OpenJDK.

halibobo1205 · 2024-09-04T08:47:06Z

Caution

Strong data consistency and finality
Final data consistency is required for blockchain, and it's usually guaranteed by the world state. Unfortunately, Java-Tron doesn't have a world state.
We need to think about how to ensure final data consistency.

halibobo1205 · 2024-09-05T10:39:36Z

1. JDK version compatibility
Maybe try to support OpenJDK on ARM?

halibobo1205 · 2024-10-22T09:25:58Z

A hard fork solution will be introduced in 4.8.0, switching floating-point calculations from Math to StrictMath.

halibobo1205 · 2024-10-25T09:26:10Z

Currently, java-tron supports both LevelDB and RocksDB. On the ARM architecture, we intend to support only RocksDB, mainly due to the following considerations:

Performance Advantages
RocksDB, built on top of LevelDB, offers enhanced performance, reliability, and advanced features such as multi-threaded execution, compaction optimizations, and support for larger datasets, making it more suitable for high-throughput and low-latency use cases.
Community Support

RocksDB has continuous investment and maintenance from Meta (Facebook)
Official support for RocksDB Java API
RocksDB community is more active in supporting ARM architecture
In comparison, LevelDB's community maintenance is relatively less active

Feature Completeness

RocksDB offers richer features (e.g., column families, transaction support, TtlDB)
Built-in monitoring and performance diagnostic tools are more comprehensive
Provides more flexible configuration options for optimization on ARM architecture

Future Development Trends

RocksDB is more widely used in blockchain domain
Continuously receives performance optimizations and feature updates
More timely support for new hardware features

Ecosystem Integration

Better support for RocksDB in cloud-native environments
Better integration with modern monitoring tools
More mature support for containerized deployment

Hardware Adaptation

Better optimization for new storage devices (e.g., NVMe SSD) in RocksDB
Better utilization of ARM architecture's specific instruction sets
Better support for large memory systems

This will ensure the best database usage experience on ARM architecture.

halibobo1205 · 2024-11-21T02:59:03Z

CI tests take much longer on Mac because of this issue:

RocksDBStore.openDB performance degradation after version 6.27.3 on Mac facebook/rocksdb#11035

halibobo1205 · 2024-11-21T03:04:44Z

Important

On the ARM architecture, we intend to support only RocksDB

When using RocksDB on the CI test, we found that some tests failed due to differences in behavior between LevelDB and RocksDB.

JVM core dump

RocksDB does not throw an error when it is closed
- putData
- deleteData
- getData
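
One possible way to make both back ends behave alike in tests, assuming the tests expect an exception once the store is closed, is to track the closed state in the wrapper and fail fast before touching native code. The sketch below uses an in-memory map as a stand-in for the database; `GuardedStore` and its methods are hypothetical and are not the java-tron or RocksDB API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical in-memory stand-in for a database wrapper; not the java-tron or RocksDB API.
final class GuardedStore implements AutoCloseable {
  private final Map<String, String> data = new ConcurrentHashMap<>();
  private final AtomicBoolean closed = new AtomicBoolean(false);

  /** Throws instead of letting a call reach an already released native handle. */
  private void ensureOpen() {
    if (closed.get()) {
      throw new IllegalStateException("store is closed");
    }
  }

  void putData(String key, String value) {
    ensureOpen();
    data.put(key, value);
  }

  String getData(String key) {
    ensureOpen();
    return data.get(key);
  }

  void deleteData(String key) {
    ensureOpen();
    data.remove(key);
  }

  @Override
  public void close() {
    closed.set(true);
  }
}
```

The flag check alone does not prevent a race between `close()` and an operation already in flight; a real wrapper would likely need a read/write lock around the native calls as well.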

NewOF · 2025-04-18T08:50:10Z

Principle of X87 Instruction Simulation

From the available information and source code, the key to calculating $a^b$ in pow lies in two x87 instructions, fyl2x and f2xm1. fyl2x computes $b \cdot \log_2 a$, while f2xm1 computes $2^x - 1$.
Process:
- Assume $y = a^b$ and take the base-2 logarithm of both sides: $\log_2 y = b \cdot \log_2 a$
- That is, $\Large y = 2^{b \cdot \log_2 a}$
- By combining these two instructions, we can calculate $a^b$.
At present, the implementation details of the fyl2x and f2xm1 instructions are still unknown. We will attempt to simulate the calculation through Taylor expansion.
$\log_2 x$
- $\log_2 x$ expands as $\frac{1}{\ln 2}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}(x-1)^n$
$2^x - 1$
- $2^x - 1$ expands as $\sum_{n=1}^{\infty}\frac{(x\ln 2)^n}{n!}$

Simulation implementation(c++)

Using the SoftFloat library, which supports 80-bit extended double precision (http://www.jhauser.us/arithmetic/SoftFloat.html)
Note 1: The Float80 class is a wrapper built on the SoftFloat library
Note 2: Because the 48 mismatched data points all use the exponent 0.0005, the calculation process has been partially simplified

  Float80 ln_coe[] = {
  	Float80(0x3FFF, 0x8000000000000000), // 1.0
  	Float80(0x3FFD, 0xFFFFFFFFFFFFFFFF), // 0.5
  	Float80(0x3FFD, 0xAAAAAAAAAAAAAAAA), // 0.333...
  	Float80(0x3FFC, 0xFFFFFFFFFFFFFFFF), // 0.25
  	Float80(0x3FFC, 0xCCCCCCCCCCCCCCCC), // 0.2
  	Float80(0x3FFC, 0xAAAAAAAAAAAAAAAA), // 0.166...
  };

  Float80 taylor_ln2_float80(Float80 x) {
  	return x -
           x * x * ln_coe[1] +
           x * x * x * ln_coe[2] -
           x * x * x * x * ln_coe[3] +
           x * x * x * x * x * ln_coe[4] -
           x * x * x * x * x * x * ln_coe[5];
  }

  // ln(2)^n/n!
  Float80 exp_coe[] = {
  	Float80(0.69314718055994530942),
  	Float80(0.24022650695910071233),
  	Float80(0.05550410866482157995),
  	Float80(0.00961812910762847716),
  	Float80(0.00133335581464284434),
  };

  Float80 taylor_exp2_float80(Float80 x) {
  	return x * exp_coe[0] +
           x * x * exp_coe[1] +
           x * x * x * exp_coe[2] +
           x * x * x * x * exp_coe[3] +
           x * x * x * x * x * exp_coe[4];
  }

  double taylor_pow2_float80(double x, double y) {
  	Float80 x80(x), y80(y);
  	Float80 ln2(0x3FFE, 0xB17217F7D1CF7BBB);

  	Float80 y_lg2_x = y80 * taylor_ln2_float80(x80 - Float80(1)) / ln2;
    // For the test dataset, since y_lg2_x<1, this step omits the exponentiation of the integer part
  	Float80 exp_y_lg2_x = taylor_exp2_float80(y_lg2_x) + Float80(1);

  	return exp_y_lg2_x.to_double();
  }
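
For readers who prefer Java, the same truncated series can be sketched in plain double precision. This is an assumption-laden illustration, not a port of the Float80 code: it uses 64-bit doubles instead of 80-bit extended precision, so it is not expected to reproduce the x87 bit patterns, and `TaylorPow` with its methods are hypothetical names. It only shows the structure $a^b = 2^{b \cdot \log_2 a}$ for bases close to 1.

```java
// Hypothetical double-precision sketch; it does not emulate x87 extended precision.
final class TaylorPow {

  /** ln(1 + t) truncated after six terms; only suitable for small |t|. */
  static double log1pSeries(double t) {
    double sum = 0.0;
    double power = t;
    for (int n = 1; n <= 6; n++) {
      double term = power / n;
      sum += (n % 2 == 1) ? term : -term; // alternating signs: t - t^2/2 + t^3/3 - ...
      power *= t;
    }
    return sum;
  }

  /** 2^y - 1 truncated after five terms of the series sum (y ln 2)^n / n!. */
  static double exp2m1Series(double y) {
    double z = y * StrictMath.log(2.0);
    double sum = 0.0;
    double term = 1.0;
    for (int n = 1; n <= 5; n++) {
      term *= z / n; // term now holds z^n / n!
      sum += term;
    }
    return sum;
  }

  /** a^b = 2^(b * log2 a); valid here only for a close to 1 and |b * log2 a| < 1. */
  static double pow(double a, double b) {
    double yLog2 = b * log1pSeries(a - 1.0) / StrictMath.log(2.0);
    return 1.0 + exp2m1Series(yLog2);
  }

  public static void main(String[] args) {
    double a = 1.00628495434413;
    double b = 0.0005;
    System.out.println(Long.toHexString(Double.doubleToRawLongBits(pow(a, b))));
    System.out.println(Long.toHexString(Double.doubleToRawLongBits(StrictMath.pow(a, b))));
  }
}
```

For inputs like these the series agrees with `StrictMath.pow` to many digits, but agreement in the last bit is exactly what such a simulation cannot guarantee.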

Test data(base, exp, expected)

  Data("3ff0192278704be3", 0.0005, "3ff000033518c576"); //  4137160
  Data("3ff000002fc6a33f", 0.0005, "3ff0000000061d86"); //  4065476
  Data("3ff00314b1e73ecf", 0.0005, "3ff0000064ea3ef8"); //  4071538
  Data("3ff0068cd52978ae", 0.0005, "3ff00000d676966c"); //  4109544
  Data("3ff0032fda05447d", 0.0005, "3ff0000068636fe0"); //  4123826
  Data("3ff00051c09cc796", 0.0005, "3ff000000a76c20e"); //  4166806
  Data("3ff00bef8115b65d", 0.0005, "3ff0000186893de0"); //  4225778
  Data("3ff009b0b2616930", 0.0005, "3ff000013d27849e"); //  4251796
  Data("3ff00364ba163146", 0.0005, "3ff000006f26a9dc"); //  4257157
  Data("3ff019be4095d6ae", 0.0005, "3ff0000348e9f02a"); //  4260583
  Data("3ff0123e52985644", 0.0005, "3ff0000254797fd0"); //  4367125
  Data("3ff0126d052860e2", 0.0005, "3ff000025a6cde26"); //  4402197
  Data("3ff0001632cccf1b", 0.0005, "3ff0000002d76406"); //  4405788
  Data("3ff0000965922b01", 0.0005, "3ff000000133e966"); //  4490332
  Data("3ff00005c7692d61", 0.0005, "3ff0000000bd5d34"); //  4499056
  Data("3ff015cba20ec276", 0.0005, "3ff00002c84cef0e"); //  4518035
  Data("3ff00002f453d343", 0.0005, "3ff000000060cf4e"); //  4533215
  Data("3ff006ea73f88946", 0.0005, "3ff00000e26d4ea2"); //  4647814
  Data("3ff00a3632db72be", 0.0005, "3ff000014e3382a6"); //  4766695
  Data("3ff000c0e8df0274", 0.0005, "3ff0000018b0aeb2"); //  4771494
  Data("3ff00015c8f06afe", 0.0005, "3ff0000002c9d73e"); //  4793587
  Data("3ff00068def18101", 0.0005, "3ff000000d6c3cac"); //  4801947
  Data("3ff01349f3ac164b", 0.0005, "3ff000027693328a"); //  4916843
  Data("3ff00e86a7859088", 0.0005, "3ff00001db256a52"); //  4924111
  Data("3ff00000c2a51ab7", 0.0005, "3ff000000018ea20"); //  5098864
  Data("3ff020fb74e9f170", 0.0005, "3ff00004346fbfa2"); //  5133963
  Data("3ff00001ce277ce7", 0.0005, "3ff00000003b27dc"); //  5139389
  Data("3ff005468a327822", 0.0005, "3ff00000acc20750"); //  5151258
  Data("3ff00006666f30ff", 0.0005, "3ff0000000d1b80e"); //  5185021
  Data("3ff000045a0b2035", 0.0005, "3ff00000008e98e6"); //  5295829
  Data("3ff00e00380e10d7", 0.0005, "3ff00001c9ff83c8"); //  5380897
  Data("3ff00c15de2b0d5e", 0.0005, "3ff000018b6eaab6"); //  5400886
  Data("3ff00042afe6956a", 0.0005, "3ff0000008892244"); //  5864127
  Data("3ff0005b7357c2d4", 0.0005, "3ff000000bb48572"); //  6167339
  Data("3ff00033d5ab51c8", 0.0005, "3ff0000006a279c8"); //  6240974
  Data("3ff0000046d74585", 0.0005, "3ff0000000091150"); //  6279093
  Data("3ff0010403f34767", 0.0005, "3ff0000021472146"); //  6428736
  Data("3ff00496fe59bc98", 0.0005, "3ff000009650a4ca"); //  6432355,6493373
  Data("3ff0012e43815868", 0.0005, "3ff0000026af266e"); //  6555029
  Data("3ff00021f6080e3c", 0.0005, "3ff000000458d16a"); //  7092933
  Data("3ff000489c0f28bd", 0.0005, "3ff00000094b3072"); //  7112412
  Data("3ff00009d3df2e9c", 0.0005, "3ff00000014207b4"); //  7675535
  Data("3ff000def05fa9c8", 0.0005, "3ff000001c887cdc"); //  7860324
  Data("3ff0013bca543227", 0.0005, "3ff00000286a42d2"); //  8292427
  Data("3ff0021a2f14a0ee", 0.0005, "3ff0000044deb040"); //  8517311
  Data("3ff0002cc166be3c", 0.0005, "3ff0000005ba841e"); //  8763101
  Data("3ff0000cc84e613f", 0.0005, "3ff0000001a2da46"); //  9269124
  Data("3ff000057b83c83f", 0.0005, "3ff0000000b3a640"); //  9631452

Comparison of Results

exp:0.0005 base:1.00628495434413 3ff019be4095d6ae
expected: 1.0000031326481 3ff0000348e9f02a
result:   1.0000031326481 3ff0000348e9f029

exp:0.0005 base:1.00805230779141 3ff020fb74e9f170
expected: 1.00000401003852 3ff00004346fbfa2
result:   1.00000401003852 3ff00004346fbfa1

At the beginning of testing, evaluating the series iteratively gave poor results. Later, evaluating the polynomial expansion directly with precomputed coefficients improved the results significantly. (However, two data points still do not fully match the expected values.)

halibobo1205 · 2025-04-18T09:12:35Z

The diff data (48 pow calculation instances) is the set of cases in which Math and StrictMath produce different results. If an algorithmic simulation is used instead, the implementation needs to be fully tested for full equivalence with the Math library.

NewOF · 2025-04-21T03:24:38Z

Using the simulated pow to calculate the 48 mismatched floating-point data points, the best result so far is 2 mismatches.
Comparing against historical data (block height of 11 million, 673,496 records, including the surrounding numerical calculations, and comparing the final exchange balance), the simulated pow has 58,072 mismatches against Math.pow. Using pow from the C++ standard library directly results in 48 mismatches, which is consistent with StrictMath.pow (also 48 mismatches).
From the resources and source code found so far, we speculate that the logarithm and power operations inside the x87 instructions are also implemented as polynomial expansions, with precomputed coefficients for acceleration. Because further implementation details are lacking and floating-point results are sensitive to every rounding step, an exact match is difficult to achieve. In contrast, hard-coding is more feasible.
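
The kind of comparison described above can be sketched as a small harness that counts inputs on which two pow implementations differ in at least one bit. `PowMismatchCounter` is a hypothetical name, and the two sample inputs are taken from elsewhere in this thread purely as placeholders; the real comparison ran over recorded chain data.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical comparison harness; not the tool actually used for the numbers above.
final class PowMismatchCounter {

  /** Counts inputs {base, exponent} for which the two implementations differ in any bit. */
  static int countMismatches(double[][] inputs,
                             DoubleBinaryOperator first,
                             DoubleBinaryOperator second) {
    int mismatches = 0;
    for (double[] in : inputs) {
      long bitsA = Double.doubleToLongBits(first.applyAsDouble(in[0], in[1]));
      long bitsB = Double.doubleToLongBits(second.applyAsDouble(in[0], in[1]));
      if (bitsA != bitsB) {
        mismatches++;
      }
    }
    return mismatches;
  }

  public static void main(String[] args) {
    double[][] inputs = {
        {1.0061363892207218, 0.0005},
        {1.0000046943914231, 2000.0},
    };
    // The count depends on the JDK version and CPU architecture the program runs on.
    System.out.println(countMismatches(inputs, Math::pow, StrictMath::pow));
  }
}
```

Comparing raw bit patterns rather than using `==` on doubles keeps the check strict and avoids surprises with NaN and signed zero.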

halibobo1205 · 2025-04-21T04:02:17Z

In the Java HotSpot virtual machine, do_intrinsic is an important concept related to intrinsic functions.

Intrinsics in the HotSpot JVM are special, optimized implementations of commonly used Java methods. When the JVM identifies specific method calls, it may replace the standard Java implementation of these methods with more efficient native code. Math.pow() is one such method that is commonly intrinsically optimized.

Specifically for the Math.pow() method, the HotSpot JVM handles it in the following ways:

The JVM has a function called do_intrinsic that determines whether a method can be replaced with an intrinsic implementation, and how to perform the replacement.
For Math.pow(), when the JIT (Just-In-Time compiler) compiles code containing this method call, the do_intrinsic mechanism examines the call and may replace it with native instructions or optimized algorithms corresponding to the processor architecture.
Typically, the intrinsic implementation directly uses instructions from the CPU's floating-point unit (on x86, FYL2X and F2XM1 for pow, in the same way that FSIN, FCOS, and FPTAN serve the trigonometric functions) or calls hand-optimized native math routines (for example, routines derived from Intel's math library, LIBM, in newer JDKs), thereby avoiding the slower pure Java implementation.

This optimization mechanism is usually managed by the vmIntrinsics namespace in the HotSpot source code, with relevant implementations distributed across files such as src/hotspot/share/classfile/vmSymbols.hpp, src/hotspot/share/opto/library_call.cpp, and others.

do_intrinsic(_dlog, java_lang_Math, log_name, double_double_signature, F_S)

Through this intrinsic function optimization, the HotSpot JVM allows Java programs to maintain platform independence while achieving performance close to native code, which is particularly beneficial for math-computation-intensive applications.

Here's the logic for log, and pow is similar:

317787106 · 2025-04-21T08:21:19Z

@halibobo1205 When Math calculates pow(double,double), how can you determine if the result is inconsistent with that calculated by StrictMath? What to do when inconsistencies are found? And you can specify what's hardcoded.

halibobo1205 · 2025-04-21T09:49:00Z

@317787106

Step 1: In an x86 JDK 8 environment, calculate each Bancor transaction with both Math.pow and StrictMath.pow, and record the cases where the final buyTokenQuant differs.

public class ExchangeCapsule implements ProtoCapsule<Exchange> {

  public long transaction(byte[] sellTokenID, long sellTokenQuant, boolean useStrictMath) {
    long supply = 1_000_000_000_000_000_000L;
    ExchangeProcessor processor = new ExchangeProcessor(supply, useStrictMath);
    ExchangeProcessor strictProcessor = new ExchangeProcessor(supply, true);

    long buyTokenQuant = 0;
    long strictBuyTokenQuant = 0;
    long firstTokenBalance = this.exchange.getFirstTokenBalance();
    long secondTokenBalance = this.exchange.getSecondTokenBalance();

    if (this.exchange.getFirstTokenId().equals(ByteString.copyFrom(sellTokenID))) {
      buyTokenQuant = processor.exchange(firstTokenBalance,
          secondTokenBalance,
          sellTokenQuant);
      strictBuyTokenQuant = strictProcessor.exchange(firstTokenBalance,
          secondTokenBalance,
          sellTokenQuant);
      if (!useStrictMath && buyTokenQuant != strictBuyTokenQuant) {
        logAndRecord("{}\t{}\t{}\t{}\t{}", buyTokenQuant, strictBuyTokenQuant, firstTokenBalance, secondTokenBalance, sellTokenQuant); // logAndRecord pow data
      }
      this.exchange = this.exchange.toBuilder()
          .setFirstTokenBalance(firstTokenBalance + sellTokenQuant)
          .setSecondTokenBalance(secondTokenBalance - buyTokenQuant)
          .build();
    } else {
      buyTokenQuant = processor.exchange(secondTokenBalance,
          firstTokenBalance,
          sellTokenQuant);
      strictBuyTokenQuant = strictProcessor.exchange(secondTokenBalance,
          firstTokenBalance,
          sellTokenQuant);
      if (!useStrictMath && buyTokenQuant != strictBuyTokenQuant) {
        logAndRecord("{}\t{}\t{}\t{}\t{}", buyTokenQuant, strictBuyTokenQuant,secondTokenBalance, firstTokenBalance, sellTokenQuant); // logAndRecord pow data
      }
      this.exchange = this.exchange.toBuilder()
          .setFirstTokenBalance(firstTokenBalance - buyTokenQuant)
          .setSecondTokenBalance(secondTokenBalance + sellTokenQuant)
          .build();
    }
    
    return buyTokenQuant;
  }

Step 2: Based on the data collected in step 1, calculate the pow data to be hardcoded: issuedSupply and exchangeBalance

     
public class ExchangeProcessor {

  private long supply;
  private final boolean useStrictMath;

  public ExchangeProcessor(long supply, boolean useStrictMath) {
    this.supply = supply;
    this.useStrictMath = useStrictMath;
  }

  private long exchangeToSupply(long balance, long quant) {
    long newBalance = balance + quant;
    double issuedSupply = -supply * (1.0 - Maths.pow(1.0 + (double) quant / newBalance, 0.0005, this.useStrictMath));
    long out = (long) issuedSupply;
    supply += out;
    return out;
  }

  private long exchangeFromSupply(long balance, long supplyQuant) {
    supply -= supplyQuant;
    double exchangeBalance = balance * (Maths.pow(1.0 + (double) supplyQuant / supply, 2000.0, this.useStrictMath) - 1.0);
    return (long) exchangeBalance;
  }

  public long exchange(long sellTokenBalance, long buyTokenBalance, long sellTokenQuant) {
    long relay = exchangeToSupply(sellTokenBalance, sellTokenQuant);
    return exchangeFromSupply(buyTokenBalance, relay);
  }

}

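Step 3: Adjust StrictMathWrapper

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class StrictMathWrapper {
  // Hardcoded pow results for the historical special cases, keyed by the base a.
  private static final Map<Double, Double> powData = Collections.synchronizedMap(new HashMap<>());

  public static double pow(double a, double b) {
    double strictResult = StrictMath.pow(a, b);
    return powData.getOrDefault(a, strictResult);
  }
}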
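
To make the lookup concrete, here is a hedged sketch of how one table entry could override the StrictMath result. It uses the first entry of the test data listed earlier in this thread (base `3ff0192278704be3`, expected `3ff000033518c576`). `HardcodedPow` is a hypothetical name; keying on raw bit patterns and also checking the exponent are illustrative choices, not necessarily what the actual implementation does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; class and field names are illustrative.
final class HardcodedPow {
  // Base bit pattern -> result bit pattern for exponent 0.0005; only one entry is shown.
  private static final Map<Long, Long> POW_0005 = new HashMap<>();

  static {
    POW_0005.put(0x3ff0192278704be3L, 0x3ff000033518c576L);
  }

  /** Returns the recorded historical result when one exists, else StrictMath.pow. */
  static double pow(double a, double b) {
    if (b == 0.0005) {
      Long bits = POW_0005.get(Double.doubleToRawLongBits(a));
      if (bits != null) {
        return Double.longBitsToDouble(bits);
      }
    }
    return StrictMath.pow(a, b);
  }

  public static void main(String[] args) {
    double base = Double.longBitsToDouble(0x3ff0192278704be3L);
    System.out.println(Long.toHexString(Double.doubleToRawLongBits(pow(base, 0.0005))));
  }
}
```

Storing bit patterns avoids any ambiguity from decimal literals, and the table is only written during class initialization, so concurrent reads need no extra synchronization.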

halibobo1205 · 2025-04-21T09:53:36Z

If there are other ways to implement X87 Instruction Simulation, please discuss them.
Hard-coding for pow data is currently the better solution based on performance, implementation complexity, and verification difficulty.

317787106 · 2025-04-22T02:37:01Z

@halibobo1205 I noticed that the pow results in exchangeToSupply and exchangeFromSupply are converted to long. Could this precision loss impact the handling of hardcoded special cases?

halibobo1205 · 2025-04-22T15:15:16Z

@halibobo1205 I noticed that the pow results in exchangeToSupply and exchangeFromSupply are converted to long. Could this precision loss impact the handling of hardcoded special cases?

Yes, the precision loss actually reduces the amount of pow data that needs to be hard-coded.

halibobo1205 · 2025-05-08T08:31:28Z

Principle of X87 Instruction Simulation

USE MPFR mpfr_pow

#include <stdio.h>
#include <gmp.h>
#include <mpfr.h>


// Precision settings
#define X87_PRECISION 64  // 64-bit mantissa for x87 80-bit format
int main(void) {
    mpfr_t base1, exp1, result1;
    mpfr_t base2, exp2, result2;
    mpfr_set_default_prec(X87_PRECISION);
    mpfr_set_default_rounding_mode(MPFR_RNDN);
    mpfr_init2(base1, X87_PRECISION);
    mpfr_init2(exp1, X87_PRECISION);
    mpfr_init2(result1, X87_PRECISION);

    mpfr_init2(base2, X87_PRECISION);
    mpfr_init2(exp2, X87_PRECISION);
    mpfr_init2(result2, X87_PRECISION);

    mpfr_set_d(base1, 1.0061363892207218, MPFR_RNDN);
    mpfr_set_d(exp1, 0.0005, MPFR_RNDN);

    mpfr_set_d(base2, 1.0000046943914231, MPFR_RNDN);
    mpfr_set_d(exp2, 2000, MPFR_RNDN);

    mpfr_pow(result1, base1, exp1, MPFR_RNDN);
    mpfr_pow(result2, base2, exp2, MPFR_RNDN);

    printf("pow(1.0061363892207218, 0.0005) = ");
    mpfr_out_str(stdout, 10, 17, result1, MPFR_RNDN);
    printf("\n");

    printf("pow(1.0000046943914231, 2000) = ");
    mpfr_out_str(stdout, 10, 17, result2, MPFR_RNDN);
    printf("\n");

    mpfr_clear(base1);
    mpfr_clear(exp1);
    mpfr_clear(result1);

    mpfr_clear(base2);
    mpfr_clear(exp2);
    mpfr_clear(result2);
    mpfr_free_cache();
    return 0;
}

❌ Unable to precisely simulate x86 pow

halibobo1205 · 2025-05-08T08:37:49Z

Principle of X87 Instruction Simulation

USE Apfloat

❌ The short answer is "no", see details:

Can emulate the x87 instruction pow operation? mtommila/apfloat#60

halibobo1205 · 2025-05-08T09:43:22Z

cc @NewOF

NewOF · 2025-05-08T10:33:02Z

According to the documentation of the 8087 support library mentioned in the link, it provides a simulated implementation of the relevant instructions. However, since the corresponding source code is not provided, it is not possible to verify the correctness.

halibobo1205 · 2025-05-08T10:37:31Z

Hard-Code:

polkadot for grandpa_support

NewOF · 2025-05-08T10:37:40Z

Some constants mentioned in another link do not provide precise hexadecimal representations; instead, they use decimal floating-point numbers with insufficient precision, which are not very meaningful for our calculation scenario.

halibobo1205 · 2025-05-12T03:59:36Z

refer :

halibobo1205 · 2025-05-12T04:08:07Z

Floating-point arithmetic

Important

For historical data (the Bancor trading pair): hard-coded special cases (48 pow calculation instances)

Hard-coding remains the optimal solution at this stage, though we'll continue exploring alternative approaches. I welcome your thoughts and input on this matter.
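
As a rough illustration of the hard-coding idea, here is a minimal sketch: a table of recorded results is consulted first, and `StrictMath.pow` is used for everything else. The class `HardcodedPow`, its methods, and the sample entry are hypothetical. They are not taken from PR #6327, and the entry is a placeholder rather than one of the 48 real instances.

```java
import java.util.HashMap;
import java.util.Map;

public class HardcodedPow {
    // Recorded results keyed by the exact bit patterns of (base, exponent).
    private final Map<String, Double> recorded = new HashMap<>();

    private static String key(double base, double exp) {
        return Long.toHexString(Double.doubleToRawLongBits(base))
            + ":" + Long.toHexString(Double.doubleToRawLongBits(exp));
    }

    // Registers a historical result that must be reproduced bit for bit.
    public void register(double base, double exp, double result) {
        recorded.put(key(base, exp), result);
    }

    // Returns the recorded result if one exists, otherwise StrictMath.pow.
    public double pow(double base, double exp) {
        Double hit = recorded.get(key(base, exp));
        return hit != null ? hit : StrictMath.pow(base, exp);
    }

    public static void main(String[] args) {
        HardcodedPow table = new HardcodedPow();
        // Placeholder entry (NOT real chain data): one ulp above 8.0.
        table.register(2.0, 3.0, Math.nextUp(8.0));
        // The recorded value wins over the computed one.
        System.out.println(table.pow(2.0, 3.0) == Math.nextUp(8.0));
        // Inputs without an entry fall back to StrictMath.pow.
        System.out.println(table.pow(2.0, 4.0) == StrictMath.pow(2.0, 4.0));
    }
}
```

Keying on the raw bit patterns rather than on decimal strings avoids any ambiguity about which input a recorded result belongs to.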

halibobo1205 · 2025-05-13T07:42:35Z

All current progress is documented in PR #6327. We welcome any new ideas or suggestions for further improvement.

halibobo1205 · 2025-05-16T13:34:57Z

Milestone Update (2025-05-16)

Current Status: Coding in progress in feat(architecture): support arm64 based on JDK17 #6327 (90% complete; all unit tests pass)
Next Steps:
- Fix the long CI test time on Mac; see RocksDBStore.openDB performance degradation after version 6.27.3 on Mac facebook/rocksdb#11035
- Code review submission by 2025-05-30
Owner: @halibobo1205

halibobo1205 · 2025-05-23T09:34:12Z

Milestone Update (2025-05-23)

Discussion progress: Core Devs Community Call 38
Owner: @halibobo1205

abc-x-t · 2025-05-26T06:54:20Z

Cross-platform testing

Establish comprehensive test suites to ensure the functionality works correctly on ARM.

Conduct performance benchmarking to compare x86 and ARM performance differences.

@halibobo1205 Hi, what's the current progress on this testing? I'm looking forward to seeing the results, as this outcome is quite important.

halibobo1205 · 2025-05-26T10:09:04Z

Cross-platform testing

@abc-x-t

Test suites pass on macOS and Linux for arm64
Conduct performance benchmarking to compare x86 and ARM performance differences: Performance benchmarks based on GravitonV2, Intel Xeon, and AMD EPYC are ready to begin, and the cloud provider is AWS.

abc-x-t · 2025-05-28T03:02:58Z

Cross-platform testing

@abc-x-t

Test suites pass on macOS and Linux for arm64
[ ] Conduct performance benchmarking to compare x86 and ARM performance differences: Performance benchmarks based on GravitonV2, Intel Xeon, and AMD EPYC are ready to begin, and the cloud provider is AWS.

Great! Do we have a detailed schedule for the remaining work on this feature before it’s released? And is there any way the community can get involved in reviewing or testing?

halibobo1205 · 2025-05-29T02:51:31Z

@abc-x-t Performance benchmarks based on GravitonV2, Intel Xeon, and AMD EPYC are underway! Reviews and testing are welcome in #6327

abc-x-t · 2025-05-29T10:11:10Z

@abc-x-t Performance benchmarks based on GravitonV2, Intel Xeon, and AMD EPYC are underway! Reviews and testing are welcome in #6327

OK, thanks for the information! Maybe we can add more tests to cover more scenarios.
For example, a performance comparison between JDK 8 and JDK 17 on x86, and the opcode-level tests described in #6292.

halibobo1205 added the type:feature label Aug 15, 2024

halibobo1205 mentioned this issue Aug 15, 2024

Core Devs Community Call 22 tronprotocol/pm#99

Closed

This was referenced Aug 28, 2024

Upgrade to JDK 17 for ARM Architecture #5976

Open

Core Devs Community Call 23 tronprotocol/pm#100

Closed

halibobo1205 mentioned this issue Oct 30, 2024

TIP-697: Migrate Floating-Point Calculations from Math to StrictMath tronprotocol/tips#697

Closed

halibobo1205 linked a pull request May 13, 2025 that will close this issue

feat(architecture): support arm64 based on JDK17 #6327

Draft

halibobo1205 mentioned this issue May 22, 2025

Core Devs Community Call 38 tronprotocol/pm#136

Closed

yuekun0707 added the v4.8.1 label May 27, 2025

yuekun0707 assigned halibobo1205 May 28, 2025

abc-x-t mentioned this issue May 29, 2025

Build a benchmark of data growth impact on TVM opcodes #6292

Open
