[HUDI-9206] Support reading inflight instants with HoodieLogRecordReader #13010

Merged
merged 6 commits into apache:master on Apr 2, 2025

Conversation

lokeshj1703
Contributor

Change Logs

This PR adds the ability to read inflight instants with HoodieLogRecordReader by adding an allowInflightInstants flag. The capability is required for updating the record level index (RLI), because the RLI update reads the log records to find deleted keys.
When HoodieMergedLogRecordScanner is replaced with the new reader class, callers will need to use the allowInflightInstants flag.
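
As a rough illustration of the flag's intent (not the actual HoodieLogRecordReader code), the sketch below models each log block by the instant time that wrote it and gates visibility on a boolean. `InflightFilterSketch`, `LogBlock`, and `readableKeys` are hypothetical names introduced only for this example. Only the `allowInflightInstants` name comes from this PR, and the real reader's logic may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model for illustration; these are NOT Hudi's actual classes.
final class InflightFilterSketch {

  /** A log block tagged with the instant time that wrote it (illustrative only). */
  static final class LogBlock {
    final String instantTime;
    final List<String> recordKeys;

    LogBlock(String instantTime, List<String> recordKeys) {
      this.instantTime = instantTime;
      this.recordKeys = recordKeys;
    }
  }

  /**
   * Returns the record keys a reader would see. Blocks written by instants
   * that are not in completedInstants are treated as inflight and are only
   * included when allowInflightInstants is true.
   */
  static List<String> readableKeys(List<LogBlock> blocks,
                                   Set<String> completedInstants,
                                   boolean allowInflightInstants) {
    List<String> keys = new ArrayList<>();
    for (LogBlock block : blocks) {
      boolean completed = completedInstants.contains(block.instantTime);
      if (completed || allowInflightInstants) {
        keys.addAll(block.recordKeys);
      }
    }
    return keys;
  }

  public static void main(String[] args) {
    List<LogBlock> blocks = List.of(
        new LogBlock("001", List.of("k1")),   // written by a completed instant
        new LogBlock("003", List.of("k2")));  // written by an inflight instant
    Set<String> completed = Set.of("001");

    System.out.println(readableKeys(blocks, completed, false)); // prints [k1]
    System.out.println(readableKeys(blocks, completed, true));  // prints [k1, k2]
  }
}
```

With the flag off, only blocks from completed instants are returned, which is the behavior regular readers would expect. Turning it on is meant for internal callers such as the RLI update path described above.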

Impact

NA

Risk level (write none, low, medium or high below)

low

Documentation Update

NA

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@github-actions github-actions bot added the size:M PR with lines of changes in (100, 300] label Mar 21, 2025
Contributor

@nsivabalan nsivabalan left a comment

Can we add a test for this new arg w/ the FGR?

@linliu-code
Contributor

@lokeshj1703, the context is not clear. Can you explain more about "since RLI reads the log records and finds deleted keys"?

yihua
yihua previously requested changes Mar 26, 2025
commitToTable(dataGen.generateInserts("001", 100), INSERT.value, writeConfigs)
validateOutputFromFileGroupReader(getStorageConf, getBasePath, dataGen.getPartitionPaths, true, 0, recordMergeMode)

commitToTable(dataGen.generateUniqueUpdates("003", 100), UPSERT.value, writeConfigs)
Contributor

Could you prepare the table instead of using transactions? Also, could this validation be added to existing tests, i.e., by adding a new step that removes the completed deltacommit from the timeline so that the data files become inflight and should not be read? That way we don't need this new test, and we get more coverage across different permutations.

Contributor Author

I have added a UT now. This test has been removed.

Contributor

@danny0405 danny0405 left a comment

Let's add more documentation on the flag. Under snapshot isolation (SI) transaction semantics, inflight data files should never be visible on the reader side.

@hudi-bot

CI report:

@nsivabalan
Contributor

@danny0405: are you good w/ the patch? If yes, can we go ahead and land it?

@nsivabalan
Contributor

image

@danny0405
Contributor

I'm fine with the change.

@nsivabalan nsivabalan dismissed yihua’s stale review April 2, 2025 14:17

We are going ahead for now to get this into 1.0.2. Feel free to review this async. The author can address any feedback in a follow-up patch.

@nsivabalan nsivabalan merged commit 1f52b4e into apache:master Apr 2, 2025
59 of 60 checks passed
voonhous pushed a commit to voonhous/hudi that referenced this pull request Apr 8, 2025
voonhous pushed a commit to voonhous/hudi that referenced this pull request Apr 9, 2025
voonhous pushed a commit to voonhous/hudi that referenced this pull request Apr 11, 2025
voonhous pushed a commit to voonhous/hudi that referenced this pull request Apr 15, 2025
voonhous pushed a commit to voonhous/hudi that referenced this pull request Apr 16, 2025
Labels
release-1.0.2 size:M PR with lines of changes in (100, 300]
6 participants