Skip to content

Generalize parsing of externalized type forms #428

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 12 commits into
base: code-reflection
Choose a base branch
from

Conversation

mcimadamore
Copy link
Collaborator

@mcimadamore mcimadamore commented May 15, 2025

This draft PR contains some preliminary work to generalize parsing of externalized type forms, as well as to making Java type string in code models more readable.


Progress

  • Change must not contain extraneous whitespace

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/babylon.git pull/428/head:pull/428
$ git checkout pull/428

Update a local copy of the PR:
$ git checkout pull/428
$ git pull https://git.openjdk.org/babylon.git pull/428/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 428

View PR using the GUI difftool:
$ git pr show -t 428

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/babylon/pull/428.diff

@@ -2679,7 +2679,7 @@ FunctionType typeToFunctionType(Type t) {
}

RecordTypeRef symbolToRecordTypeRef(Symbol.ClassSymbol s) {
TypeElement recordType = typeToTypeElement(s.type);
TypeElement recordType = typeToTypeElement(s.erasure(types));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new grammar for record references wants an erased type identifier here, so I tweaked the code accordingly.

@bridgekeeper
Copy link

bridgekeeper bot commented May 15, 2025

👋 Welcome back mcimadamore! A progress list of the required criteria for merging this PR into code-reflection will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 15, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

String typeArgs = hasTypeArguments() ?
typeArguments().stream().map(JavaType::toString)
.collect(Collectors.joining(", ", "<", ">")) :
"";
return String.format("%s%s%s", prefix, name, typeArgs);
}

static String escape(String s) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting (see also ExternalizedType::toString): if the type identifier is not a valid Java identifier, we need to wrap it in quotes -- otherwise we might have parsing issues when doing a round-trip. This comes up e.g. in local class names, which start with a number.

@@ -167,6 +168,11 @@ public sealed interface JavaType extends TypeElement permits ClassType, ArrayTyp
*/
ClassDesc toNominalDescriptor();

@Override
default ExternalizedTypeElement externalize() {
return DescParser.parseExTypeElem(toString());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is perhaps the main issue with this approach (see similar code in JavaRef). That is, while we can parse back textual representation that look like Java types seamlessly, the toString of the parsed ExternalizedTypeElement doesn't look like the original text.

As an example, consider the following type string:

Foo<? extends Bar>

This is parsed correctly. But if we print the toString of the (parsed) externalized type, we get:

Foo<extends<?, Bar>>

In other words, since at the externalized type element level it's all about type application, when we render the external form we no longer know what was an infix operator vs. prefix (and also whether ( or < should be used).

This means that we can't just render the toString of javaType.externalize() -- as that will not look like Java types at all. Instead, we need to render the javaType.toString() -- in other words, the toString of a Java types "happens" to be parseable back as an externalized form.

// '?' 'extends' JavaType // covariant type argument
// '?' 'super' JavaType // contravariant type argument
// JavaType
JavaType constructType(JavaType qualifier, ExternalizedTypeElement tree) {
Copy link
Collaborator Author

@mcimadamore mcimadamore May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As can be seen, the code here is quite complex. This is due to the fact that there's no general 1-1 mapping between type operators and JavaType forms. This means that when we parse an ExternalizedTypeElement back into a JavaType we have to look hard at its structure, to understand what it is, which leads to the spaghetti code in this class.

// JavaType '::' ident ':' JavaType // field reference
// JavaType '::' ident '(' JavaType* ')' ':' JavaType // method reference
// JavaType '::' '(' JavaType* ')' // constructor reference
// ident '(' RecordComponent* ')' // record reference
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to change this slightly because the original production was:

'(' JavaType* ')' JavaType

But we cannot express that combination using the generalized grammar we have for externalized types (because the last JavaType occurs in "postfix" position).

@@ -121,8 +122,7 @@
*/
public final class OpParser {

static final TypeElement.ExternalizedTypeElement VOID =
new TypeElement.ExternalizedTypeElement("void", List.of());
static final TypeElement.ExternalizedTypeElement VOID = JavaType.VOID.externalize();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same workaround we discussed for the other attempt.

// JavaType '::' JavaRefParams // constructor reference
// ident '(' RecordComponent* ')' // record reference
//
// JavaRefParams:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an issue here: if there's no parameter (e.g. the method/constructor takes no args), then we have no externalized type arguments. This means that if we print that, we lose the <> and we cannot parse it back again (it will look like a field ref, not a method ref). To address this I've applied the same trick used in C -- where an empty parameter list is denoted as (void).

// ExSymbol:
// '::'
// ':'
// '?'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially listed -> here -- but I quickly ran into an issue where parsing a func op would fail, because OpWriter also uses the -> token to separate the return type of the func op from the body (e.g. the ex type parser for the return type of the func op ends up consuming tokens that are not meant for it).

// ClassType // class type
// PrimitiveType // primitive type
// TypeVar // type variable
// '[' JavaType ']' // array type
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lack of a post-fix form is forcing us down some suboptimal path. I picked this mostly because it's succinct to read/write (although probably quite surprising).

// ClassTypeNoPackage:
// ident // simple class type
// ident '<' TypeArg* '>' // parameterized class type
// ClassType '::' ClassType // nested class type
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't use $ here -- because $ can be the start of a Java identifier -- so if we have, say Foo$Bar the parser will read that as a single identifier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant