'native-image' compilation process crashes very rarely #3245

Closed
jiekang opened this issue Feb 25, 2021 · 7 comments
Collaborator

jiekang commented Feb 25, 2021

Describe the issue

When using native-image to compile a binary, the JVM process crashes very rarely:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f42b9c53260, pid=718926, tid=718974
#
# JRE version: OpenJDK Runtime Environment GraalVM 21.1.0-dev (11.0.10+9) (build 11.0.10+9-jvmci-21.1-b01)
# Java VM: OpenJDK 64-Bit Server VM GraalVM 21.1.0-dev (11.0.10+9-jvmci-21.1-b01, mixed mode, tiered, jvmci, compressed oops, parallel gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x5db260]  ClassLoaderData::is_builtin_class_loader_data() const+0x0
#

Steps to reproduce the issue

  1. Download EventCommit.java from: https://gist.github.com/jiekang/c2dd7851de31b88e7c6feb9453224303
  2. Compile with labsjdk11: javac EventCommit.java
  3. Build substratevm: mx build
  4. Build native image:
    mx native-image -ea -esa --no-fallback -H:+AllowVMInspection EventCommit

The crash may be more likely with:
    mx native-image -ea -esa --no-fallback -H:+AllowVMInspection "-J-XX:FlightRecorderOptions=retransform=false" EventCommit

It has also been seen with both of the following:
#3155: mx native-image -ea -esa --no-fallback -H:+AllowVMInspection EventCommit
#3070: mx native-image -ea -esa --no-fallback -H:+FlightRecorder EventCommit
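
Because the crash is rare, it may help to rerun the build in a loop until it fails. The following is a minimal sketch of such a loop, not part of the original report. It assumes that a crashed builder JVM causes `mx` to exit with a non-zero status; the class name `RetryUntilFailure`, the helper names, and the cap of 200 attempts are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.function.IntSupplier;

final class RetryUntilFailure {

    /**
     * Runs the attempt up to maxAttempts times.
     * Returns the 1-based number of the first attempt that yields a
     * non-zero exit code, or -1 if every attempt yields 0.
     */
    static int runUntilFailure(IntSupplier attempt, int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsInt() != 0) {
                return i;
            }
        }
        return -1;
    }

    /** Wraps an external command as an attempt that yields the process exit code. */
    static IntSupplier command(List<String> cmd) {
        return () -> {
            try {
                // inheritIO() forwards the build output to this process's console.
                Process p = new ProcessBuilder(cmd).inheritIO().start();
                return p.waitFor();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        };
    }

    public static void main(String[] args) {
        // Command line taken from step 4 above.
        List<String> build = List.of("mx", "native-image", "-ea", "-esa",
                "--no-fallback", "-H:+AllowVMInspection", "EventCommit");
        int failed = runUntilFailure(command(build), 200); // 200 is an arbitrary cap
        System.out.println(failed < 0
                ? "no failure observed"
                : "first failure on attempt " + failed);
    }
}
```

A non-zero exit can also come from an ordinary build error, so a reported failure still needs to be checked against an hs_err log before it counts as a reproduction.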

Describe GraalVM and your environment:

  • GraalVM version: 495ad03
  • JDK major version: 11: labsjdk-ce-11.0.10-jvmci-21.1-b01
  • OS: linux
  • Architecture: amd64

More details

hs_err:
https://gist.github.com/jiekang/2a3064b0e9c623bc5e0aa32edf20eebf

Edit:
2021-03-03: Updated steps to reproduce issue with code

Member

dougxc commented Mar 3, 2021

The crash is happening in a stack that contains com.redhat.jfr.events.StringEvent.<clinit>()V+2 so there may be something about that class initializer being called in an unexpected JVM state?

j  jdk.jfr.FlightRecorder.register(Ljava/lang/Class;)V+20 jdk.jfr@11.0.10
j  com.redhat.jfr.events.StringEvent.<clinit>()V+2
v  ~StubRoutines::call_stub
V  [libjvm.so+0x88dcd6]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x366
V  [libjvm.so+0x85edae]  InstanceKlass::call_class_initializer(Thread*)+0x22e
V  [libjvm.so+0x85f463]  InstanceKlass::initialize_impl(Thread*)+0x563
V  [libjvm.so+0xe56cf0]  Unsafe_EnsureClassInitialized0+0xf0
J 2323  jdk.internal.misc.Unsafe.ensureClassInitialized0(Ljava/lang/Class;)V java.base@11.0.10 (0 bytes) @ 0x00007f88d6bee0ff [0x00007f88d6bee040+0x00000000000000bf]
J 2717 c1 com.oracle.svm.hosted.classinitialization.ConfigurableClassInitialization.ensureClassInitialized(Ljava/lang/Class;Z)Lcom/oracle/svm/hosted/classinitialization/InitKind; (160 bytes) @ 0x00007f88cfaab1c4 [0x00007f88cfaab060+0x0000000000000164]
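
For readers without the gist at hand, the shape of code that yields such a stack can be sketched as follows. This is an illustrative stand-in, not the actual `com.redhat.jfr.events.StringEvent` source: the class `ClinitRegisterSketch`, the `message` field, and the `initializerRan` flag are assumptions. It uses `Class.forName` with `initialize=true` as a public-API analogue of the `Unsafe.ensureClassInitialized` call made by the image builder.

```java
import jdk.jfr.Event;
import jdk.jfr.FlightRecorder;
import jdk.jfr.Label;

final class ClinitRegisterSketch {

    // Set by the static initializer so callers can observe that it ran.
    static volatile boolean initializerRan = false;

    // Hypothetical stand-in for the event class named in the stack trace.
    @Label("String Event")
    static class StringEvent extends Event {
        static {
            // The same API call that sits directly above <clinit> in the stack.
            FlightRecorder.register(StringEvent.class);
            initializerRan = true;
        }

        @Label("Message")
        String message;
    }

    /** Forces StringEvent's static initializer to run and reports whether it did. */
    static boolean forceInitialization() {
        try {
            // Taking the class literal does not initialize the class;
            // Class.forName with initialize=true does.
            Class.forName(StringEvent.class.getName(), true,
                    ClinitRegisterSketch.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return initializerRan;
    }

    public static void main(String[] args) {
        System.out.println("initializer ran: " + forceInitialization());
    }
}
```

On an ordinary JVM this is expected to run without incident; the point is only to show where the class initializer and the `register` call sit relative to the forced initialization.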

@dougxc dougxc assigned christianwimmer and unassigned dougxc Mar 3, 2021
Collaborator Author

jiekang commented Mar 3, 2021

I'll update with full reproducer instructions and include the code shortly.

Collaborator Author

jiekang commented Mar 3, 2021

I've edited the steps to reproduce and included a gist with minimal code. It sounds like an issue with HotSpot's transformation of event classes for JFR. This is pretty niche; I'm not sure if this bug report is appropriate for upstream at this exact moment, since the JFR work is still in progress at https://github.com/native-image-jfr/graal

@christianwimmer

Since the crash is in the JDK module system code

V  [libjvm.so+0x5d8700]  ClassLoaderData::is_builtin_class_loader_data() const+0x0
V  [libjvm.so+0xc160a2]  Modules::add_reads_module(_jobject*, _jobject*, Thread*)+0x242
V  [libjvm.so+0x93a4ae]  JVM_AddReadsModule+0x4e
j  java.lang.Module.addReads0(Ljava/lang/Module;Ljava/lang/Module;)V+0 java.base@11.0.10

it could also be just a generic JDK bug, where the HotSpot code needs to be made more defensive. So maybe file an OpenJDK bug for this?

Collaborator

adinn commented Mar 4, 2021

It looks to me like this happens when Modules::add_reads_module(from_module, to_module) calls from_module->add_read(to_module), which in turn calls this->set_read_walk_required(to_module->loader_data()). This last call effectively executes to_module->loader_data()->is_builtin_class_loader_data(). Since by that point we know to_module cannot be NULL, the SEGV indicates that to_module->loader_data() is NULL.
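
The call chain described above can be sketched as a small Java model. This is not HotSpot code (the real implementation is C++): the class names mirror the HotSpot ones only for readability, the body of `setReadWalkRequired` is an assumption beyond the single dereference the analysis mentions, and a `NullPointerException` stands in for the SIGSEGV.

```java
import java.util.Objects;

final class AddReadsModel {

    /** Stand-in for HotSpot's ClassLoaderData. */
    static final class LoaderData {
        private final boolean builtin;

        LoaderData(boolean builtin) {
            this.builtin = builtin;
        }

        boolean isBuiltinClassLoaderData() {
            return builtin;
        }
    }

    /** Stand-in for HotSpot's ModuleEntry. */
    static final class ModuleEntry {
        private final LoaderData loaderData; // null models the suspected bad state
        boolean readWalkRequired;

        ModuleEntry(LoaderData loaderData) {
            this.loaderData = loaderData;
        }

        LoaderData loaderData() {
            return loaderData;
        }

        void addRead(ModuleEntry to) {
            setReadWalkRequired(to.loaderData());
        }

        void setReadWalkRequired(LoaderData toLoaderData) {
            // Dereferences the argument: a null value fails here, which is
            // the Java analogue of the crash in is_builtin_class_loader_data().
            if (!toLoaderData.isBuiltinClassLoaderData()) {
                readWalkRequired = true;
            }
        }
    }

    /** Models Modules::add_reads_module: both entries are known to be non-null. */
    static void addReadsModule(ModuleEntry from, ModuleEntry to) {
        Objects.requireNonNull(from, "from");
        Objects.requireNonNull(to, "to");
        from.addRead(to);
    }

    /** Returns true if a target module without loader data makes the call fail. */
    static boolean failsOnNullLoaderData() {
        ModuleEntry from = new ModuleEntry(new LoaderData(true));
        ModuleEntry to = new ModuleEntry(null);
        try {
            addReadsModule(from, to);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("fails on null loader data: " + failsOnNullLoaderData());
    }
}
```

In the model, the entries themselves pass the null checks and only the loader data of the target is missing, which matches the reasoning that to_module is valid while to_module->loader_data() is not.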

Now normally when a module is created (by ModuleEntryTable::locked_create_entry) there is a prior assert that the field _class_loader_data is non-NULL. So, it's unlikely that in this case a NULL value has slipped through. The only case where that method is not used to create a module entry is when the java.base module is created. Its _class_loader_data may be temporarily NULL, up until ClassLoaderData::init_null_class_loader_data() gets called. It looks rather too late for that to be happening here but ...

Running with module debug logging enabled might clarify things -- that requires a debug build of the JVM.

Collaborator Author

jiekang commented Jul 7, 2021

Sorry I haven't been tracking this very well. I ended up opening a bug against OpenJDK to track this as an investigation from the OpenJDK side.

https://bugs.openjdk.java.net/browse/JDK-8265071

But now I realize it got closed with a reference back to this issue, hahah......

I'll see if I can allocate cycles of my own to debug this.

Member

wirthi commented May 6, 2025

Closing this older ticket with no updates in 4 years.

I don't see any similar crash in our CI. If this is still relevant, please reopen; I suspect it has long been fixed.

@wirthi wirthi closed this as completed May 6, 2025