Data-aware scheduling maps dataset updates to downstream DAG runs incorrectly when multiple updates occur while the downstream DAG is running #49805
Labels
area:core
kind:bug
needs-triage
priority:medium
Apache Airflow version
2.10.5
If "Other Airflow 2 version" selected, which one?
No response
What happened?
I created a demo in which the upstream DAG updates the dataset continuously, while the downstream DAG just sleeps for 30 seconds to simulate a time-consuming task.
However, the mapping between dataset updates and downstream DAG runs is weird. It looks like the following, and dangling dataset updates remain if I stop the upstream DAG.
The dangling dataset updates are actually processed by the downstream DAG, but the UI shows them as if they would never be processed.
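To make the scenario concrete, here is a toy model in plain Python of the timeline described above. It is not Airflow's scheduler code and does not claim to reproduce its logic. It only sketches one plausible expectation: each downstream run consumes every update queued up to the moment it starts, so no update is left dangling once the downstream DAG catches up. The 10-second update interval and the names `Run` and `simulate` are assumptions made for illustration; only the 30-second downstream duration comes from the report.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One hypothetical downstream run and the update timestamps it consumes."""
    start: int
    events: list = field(default_factory=list)


def simulate(update_times, run_seconds=30):
    """Toy model: one downstream run at a time; each run consumes every
    update whose timestamp is not later than the run's start time."""
    runs = []
    queue = sorted(update_times)
    free_at = 0  # second at which the downstream DAG becomes idle again
    while queue:
        # A run starts once the downstream is idle and an update is waiting.
        start = max(free_at, queue[0])
        consumed = [t for t in queue if t <= start]
        queue = [t for t in queue if t > start]
        runs.append(Run(start=start, events=consumed))
        free_at = start + run_seconds
    return runs


# Assumed timeline: the upstream updates the dataset every 10 s for 90 s, then stops.
runs = simulate(range(0, 100, 10))
for run in runs:
    print(f"run at t={run.start:>3}s consumes updates {run.events}")
```

Under this model every update ends up attached to exactly one run after the upstream stops, which is the kind of mapping the UI would be expected to show if the updates are in fact being processed.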
What you think should happen instead?
How to reproduce
Operating System
Ubuntu
Versions of Apache Airflow Providers
No response
Deployment
Other 3rd-party Helm chart
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct