-
Notifications
You must be signed in to change notification settings - Fork 25.3k
ESQL: Fix some more usages of field attribute names in LOOKUP JOIN #129355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Fix some more usages of field attribute names in LOOKUP JOIN #129355
Conversation
In LOOKUP JOIN, union type fields don't work as match fields because we extract their names wrong. Correct this and make MatchConfig rely on FieldName to avoid this problem going forward.
Pinging @elastic/es-analytical-engine (Team:Analytics) |
private MatchConfig(FieldAttribute match, Layout.ChannelAndType input) { | ||
// Note, this handles TEXT fields with KEYWORD subfields | ||
this(match.exactAttribute().name(), input.channel(), input.type()); | ||
this(match.exactAttribute().fieldName(), input.channel(), input.type()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not quite obvious to me how the keyword subfield of a text field works with multi-typed field, it will be great if we have a test to exercise this code path, or some comments for when match.exactAttribute().name()
is different from match.exactAttribute().fieldName()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I think this line is super redundant. The matchfield is a field from the lookup index. We can't even use TEXT as a lookup key in the lookup index, it's only allowed in the main index. So the dropping down to the exact subfield just doesn't ever apply.
I'll expand the comment. Testing this code path is currently not really possible (at least not in a realistic scenario.)
In any case, using fieldName
over name
is going to be more correct once the call to exactAttribute
stops being redundant.
# union type behavior | ||
############################################### | ||
|
||
joinOnMultiTypedMatchFieldCastToInteger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding tests for multi-typed numeric fields as join keys. We still have gaps in test coverage for multi-typed join keys involving string types (keyword, text, ip) and date types (millis, nanos). If these are not within the scope of this PR, they should be covered in a separate one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a more complete list of tests to LookupJoinTypesIT
. It's still not exhaustive but IMO covers enough ground for now.
That required a bit of refactoring ^^
Use names that are easier to follow and require less explanation.
This reverts commit a3da244.
// Union types; non-exhaustive and can be extended | ||
{ | ||
TestConfigs configs = testConfigurations.computeIfAbsent("union-types", TestConfigs::new); | ||
configs.addUnionTypePasses(SHORT, INTEGER, INTEGER); | ||
configs.addUnionTypePasses(BYTE, DOUBLE, LONG); | ||
configs.addUnionTypePasses(DATETIME, DATE_NANOS, DATE_NANOS); | ||
configs.addUnionTypePasses(DATE_NANOS, DATETIME, DATETIME); | ||
configs.addUnionTypePasses(SCALED_FLOAT, HALF_FLOAT, DOUBLE); | ||
configs.addUnionTypePasses(TEXT, KEYWORD, KEYWORD); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union-type test cases added here!
@@ -270,16 +241,16 @@ protected Collection<Class<? extends Plugin>> nodePlugins() { | |||
Set<String> knownTypes = new HashSet<>(); | |||
for (TestConfigs configs : testConfigurations.values()) { | |||
for (TestConfig config : configs.configs.values()) { | |||
if (knownTypes.contains(config.indexName())) { | |||
throw new IllegalArgumentException("Duplicate index name: " + config.indexName()); | |||
if (knownTypes.contains(config.lookupIndexName())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, we only enforce uniqueness for lookupIndexName
. If this restriction is intended to apply only to lookup indices, it might be helpful to update the error message to make that explicit—for example:
"Duplicate lookup index name:"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update the message!
I think long term, we should refactor the test setup as it creates too many indices. We only really need 1 main index (with a field for every relevant type), 1 lookup index (again, 1 field for every relevant type) and only additional indices for union type cases (because we really need mapping conflicts there).
That would simplify the index creation, validation etc. I initially wanted to do this change in this PR, but I already introduced a lot of changes so that's better to do another time IMHO.
...esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupJoinTypesIT.java
Outdated
Show resolved
Hide resolved
...esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupJoinTypesIT.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding more tests around union-typed fields, @alex-spies! I’ve added a few comments in LookupJoinTypesIT
, mostly trying to improve readability.
Co-authored-by: Nhat Nguyen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you @alex-spies !
💔 Backport failed
You can use sqren/backport to manually backport by running |
…lastic#129355) * Add a test that demonstrates that LOOKUP JOIN works with union types. * Fix usage of FieldAttribute#name where we should use FieldAttribute#fieldName in LOOKUP JOIN. * Refactor LookupJoinTypesIT This doesn't fix any bugs per se, but will avoid bwc problems once we propagate union type field attributes into Join LogicalPlan nodes. (cherry picked from commit 65cbccc) # Conflicts: # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupFromIndexIT.java # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupJoinTypesIT.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextFunction.java
…lastic#129355) * Add a test that demonstrates that LOOKUP JOIN works with union types. * Fix usage of FieldAttribute#name where we should use FieldAttribute#fieldName in LOOKUP JOIN. * Refactor LookupJoinTypesIT This doesn't fix any bugs per se, but will avoid bwc problems once we propagate union type field attributes into Join LogicalPlan nodes. (cherry picked from commit 65cbccc) # Conflicts: # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupJoinTypesIT.java
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
…lastic#129355) * Add a test that demonstrates that LOOKUP JOIN works with union types. * Fix usage of FieldAttribute#name where we should use FieldAttribute#fieldName in LOOKUP JOIN. * Refactor LookupJoinTypesIT This doesn't fix any bugs per se, but will avoid bwc problems once we propagate union type field attributes into Join LogicalPlan nodes. (cherry picked from commit 65cbccc) # Conflicts: # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupFromIndexIT.java # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupJoinTypesIT.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextFunction.java
…129355) (#129577) * Add a test that demonstrates that LOOKUP JOIN works with union types. * Fix usage of FieldAttribute#name where we should use FieldAttribute#fieldName in LOOKUP JOIN. * Refactor LookupJoinTypesIT This doesn't fix any bugs per se, but will avoid bwc problems once we propagate union type field attributes into Join LogicalPlan nodes. (cherry picked from commit 65cbccc) # Conflicts: # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupFromIndexIT.java # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupJoinTypesIT.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextFunction.java
…129355) (#129579) * Add a test that demonstrates that LOOKUP JOIN works with union types. * Fix usage of FieldAttribute#name where we should use FieldAttribute#fieldName in LOOKUP JOIN. * Refactor LookupJoinTypesIT This doesn't fix any bugs per se, but will avoid bwc problems once we propagate union type field attributes into Join LogicalPlan nodes. (cherry picked from commit 65cbccc) # Conflicts: # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupFromIndexIT.java # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupJoinTypesIT.java # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/FullTextFunction.java
…129355) (#129578) * Add a test that demonstrates that LOOKUP JOIN works with union types. * Fix usage of FieldAttribute#name where we should use FieldAttribute#fieldName in LOOKUP JOIN. * Refactor LookupJoinTypesIT This doesn't fix any bugs per se, but will avoid bwc problems once we propagate union type field attributes into Join LogicalPlan nodes. (cherry picked from commit 65cbccc) # Conflicts: # x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/LookupJoinTypesIT.java
This is not a bug fix - it's future-proofing.
LOOKUP JOIN
with union types already works fine due to the fact that the union type field attributes never make it into theJoin
node. E.g. forFROM indices* | EVAL field = field::integer
with a multi-typedfield
we create a field attribute called$$field$$converted_to$integer
- but if we add the commandLOOKUP JOIN lookup_index ON field
,field
will refer to the reference attributefield
created in theEVAL
.However, our implementation of LOOKUP JOIN still relies on
FieldAttribute#name
rather thanFieldAttribute#fieldName
when telling the compute engine which physical index fields to use. The latter is always correct, the former is wrong for union types. To be able to propagate union type field attributes into the join, we need to useFieldAttribute#fieldName
- and that's what this PR does.This is relevant for when
LOOKUP JOIN
gets generalized and starts using proper expressions, likeLOOKUP JOIN lookup_index ON match_field::keyword == other_field
.Also, it's beneficial not to rely on
FieldAttribute#name
when telling the compute engine which fields to fetch from the lookup index. It means we can handily rename the field attributes while still referring to the same physical fields.