Insights: apache/beam
Overview
53 Pull requests merged by 18 people
-
Fix IndexOutOfBoundsException invoking expansion service
#34428 merged
Apr 24, 2025 -
Update contributor doc reflecting #33954
#34640 merged
Apr 24, 2025 -
Skip this failed test until further investigation is done.
#34737 merged
Apr 24, 2025 -
Bump github.com/go-jose/go-jose/v4 from 4.0.4 to 4.0.5 in /sdks
#34735 merged
Apr 24, 2025 -
Update RunInference type hints to PEP 585 standards
#34719 merged
Apr 24, 2025 -
Bump github.com/nats-io/nats.go from 1.41.0 to 1.41.2 in /sdks
#34726 merged
Apr 24, 2025 -
Bump google.golang.org/grpc from 1.71.1 to 1.72.0 in /sdks
#34729 merged
Apr 24, 2025 -
Update code-change-guide.md
#34697 merged
Apr 24, 2025 -
Fix Iceberg Integration tests
#34686 merged
Apr 24, 2025 -
[KafkaIO] Update tracker and watermark for non-visible progress
#34202 merged
Apr 24, 2025 -
Bump cloud.google.com/go/bigtable from 1.36.0 to 1.37.0 in /sdks
#34730 merged
Apr 24, 2025 -
Bump github.com/aws/aws-sdk-go-v2/feature/s3/manager from 1.17.67 to 1.17.73 in /sdks
#34727 merged
Apr 24, 2025 -
Revert "Bump @octokit/request, @actions/github and @octokit/rest in /scripts/ci/pr-bot"
#34725 merged
Apr 24, 2025 -
[extensions-proto] Return correct TypeDescriptor for ProtoCoder (#34561)
#34562 merged
Apr 24, 2025 -
Bump @octokit/request, @actions/github and @octokit/rest in /scripts/ci/pr-bot
#34545 merged
Apr 23, 2025 -
Update beam_PreCommit_Java.yml
#34685 merged
Apr 23, 2025 -
Add support and unit test for PyOD models
#34709 merged
Apr 23, 2025 -
Bump github.com/aws/aws-sdk-go-v2/config from 1.29.10 to 1.29.14 in /sdks
#34610 merged
Apr 23, 2025 -
Add a md5sum check before overwriting a cached prism file
#34715 merged
Apr 23, 2025 -
Bump github.com/avast/retry-go/v4 from 4.6.0 to 4.6.1 in /sdks
#34373 merged
Apr 23, 2025 -
Bump github.com/testcontainers/testcontainers-go from 0.35.0 to 0.36.0 in /sdks
#34642 merged
Apr 23, 2025 -
Bump github.com/aws/aws-sdk-go-v2/credentials from 1.17.64 to 1.17.67 in /sdks
#34634 merged
Apr 23, 2025 -
Update run_rc_validation_python_yaml.yml
#34675 merged
Apr 23, 2025 -
Bump cloud.google.com/go/spanner from 1.78.0 to 1.79.0 in /sdks
#34609 merged
Apr 23, 2025 -
Bump golang.org/x/crypto from 0.21.0 to 0.35.0 in /.test-infra/mock-apis
#34632 merged
Apr 23, 2025 -
Bump golang.org/x/net from 0.36.0 to 0.38.0 in /learning/katas/go
#34650 merged
Apr 23, 2025 -
Phase 1sh - more yaml integration tests
#34692 merged
Apr 23, 2025 -
Add schema definition for yaml test specs.
#34710 merged
Apr 23, 2025 -
Use language connector in alloydb colab.
#34713 merged
Apr 22, 2025 -
Updated the container doc
#34682 merged
Apr 22, 2025 -
Fix input validation for non-chain YAML-provider transforms.
#34701 merged
Apr 22, 2025 -
Add core and missing transformers for yaml integration testing - phase 2sh
#34654 merged
Apr 22, 2025 -
[BigQueryIO] support MicrosInstant type (for xlang)
#34707 merged
Apr 22, 2025 -
Add some more cloudpickle tests.
#34683 merged
Apr 22, 2025 -
Add new resource hints: max active bundles per worker
#34529 merged
Apr 22, 2025 -
add contributing dev doc for yaml integration tests
#34636 merged
Apr 22, 2025 -
Properly propagate schemas of Beam YAML Partition transform.
#34595 merged
Apr 22, 2025 -
fixed jdbc io
#34704 merged
Apr 22, 2025 -
Add adapters to address output conversion for OfflineDetector.
#34662 merged
Apr 22, 2025 -
Clickhouse IT RM JDBC fix + ClickHouseRowsCheck
#34407 merged
Apr 22, 2025 -
Update py container
#34691 merged
Apr 22, 2025 -
Update REVIEWERS.yml - remove lostluck from fallback
#34696 merged
Apr 21, 2025 -
Update release-guide.md
#34681 merged
Apr 21, 2025 -
Tour of Beam: update GroupByKey example
#33242 merged
Apr 21, 2025 -
Minor fixes to yaml testing docs.
#34618 merged
Apr 21, 2025 -
fix the doc
#34684 merged
Apr 21, 2025 -
Provide kafka unbounded reader to checkpoint mark when offset based deduplication is supported.
#34669 merged
Apr 19, 2025 -
Added run_rc_validation_python_yaml to validate the Beam YAML RC
#34670 merged
Apr 18, 2025 -
Update link in yaml-providers.md
#34570 merged
Apr 18, 2025 -
Back port virtualenv-clone to 2.64 Python SDK container
#34667 merged
Apr 18, 2025 -
Ignore test_flatten_and_gbk for spark and samza runners.
#34663 merged
Apr 18, 2025 -
Regenerate Python requirement
#34666 merged
Apr 18, 2025 -
Remove unnecessary checkNotNull
#34664 merged
Apr 18, 2025
25 Pull requests opened by 16 people
-
Add withReshuffle(boolean) option to FileIO.matchAll()
#34677 opened
Apr 20, 2025 -
Fix loopback
#34678 opened
Apr 20, 2025 -
Fix: Remove duplicate mutations in flushBatch to prevent Datastore commit
#34679 opened
Apr 20, 2025 -
[DO NOT MERGE] tried to add artifactApiService
#34689 opened
Apr 21, 2025 -
Enable cloudpickle by default
#34695 opened
Apr 21, 2025 -
Add option to pickler dumps() for best-effort determinism
#34698 opened
Apr 21, 2025 -
Support customizing how built-in types are pickled for cloudpickle
#34699 opened
Apr 21, 2025 -
complete implementation of open ai text embedding with test #new
#34700 opened
Apr 21, 2025 -
Fix handling of collections.Counter types
#34706 opened
Apr 22, 2025 -
Improve validation error message for chain transform
#34708 opened
Apr 22, 2025 -
Improve failure message when passing Pipeline object / PBegin objects…
#34716 opened
Apr 23, 2025 -
Remove caching for remainders of state backed iterables.
#34718 opened
Apr 23, 2025 -
[ManagedIO] Single source of truth for ManagedIO configs
#34720 opened
Apr 24, 2025 -
Attempt prism for pipelines with unbounded PCollections.
#34721 opened
Apr 24, 2025 -
Deep copy PipelineOptions and keep the input intact.
#34723 opened
Apr 24, 2025 -
Bump github.com/golang-cz/devslog from 0.0.11 to 0.0.12 in /sdks
#34728 opened
Apr 24, 2025 -
[WIP] Upgrade io.debezium using #33526 as a blueprint
#34733 opened
Apr 24, 2025 -
[WIP] Flink watermark emission
#34734 opened
Apr 24, 2025 -
[Do not merge] Parse struct returned from Dataflow API to BoundedTrieData
#34738 opened
Apr 24, 2025 -
Only declare state backed iterables on code channel.
#34740 opened
Apr 24, 2025 -
Remove myself from java reviewers and focus on python and go.
#34741 opened
Apr 25, 2025 -
Bump cloud.google.com/go/storage from 1.51.0 to 1.52.0 in /sdks
#34742 opened
Apr 25, 2025 -
Bump google.golang.org/api from 0.229.0 to 0.230.0 in /sdks
#34743 opened
Apr 25, 2025 -
Bump github.com/docker/docker from 28.0.4+incompatible to 28.1.1+incompatible in /sdks
#34744 opened
Apr 25, 2025 -
Bump cloud.google.com/go/spanner from 1.79.0 to 1.80.0 in /sdks
#34745 opened
Apr 25, 2025
25 Issues closed by 13 people
-
The pr-bot-pr-updates job is flaky
#34731 closed
Apr 25, 2025 -
The pr-bot-new-prs job is flaky
#34724 closed
Apr 25, 2025 -
The PreCommit Python job is flaky
#34739 closed
Apr 25, 2025 -
[Feature Request]: Include job name in GCS custom audit info
#31299 closed
Apr 24, 2025 -
[Bug]: org.apache.beam.sdk.util.construction.BeamUrns.getUrn fails when trying to parse Protobuf messages
#34625 closed
Apr 24, 2025 -
[Bug]: The python Flatten example shows an empty line 31 and nothing else
#29111 closed
Apr 24, 2025 -
[Bug]: ProtoCoder does not return correctly encoded TypeDescriptor
#34561 closed
Apr 24, 2025 -
Performance Regression or Improvement: cogbk_python_batch_load_test_reiterate_4times_10KB_values:runtime
#34688 closed
Apr 23, 2025 -
[Bug]: Beam YAML provider docs show unsupported provider configuration
#34646 closed
Apr 22, 2025 -
[Task]: Improve Yaml Integration test coverage for top 5 transformers missing
#34611 closed
Apr 22, 2025 -
[Bug]: Security vulnerability in parquet serialization and deserialization
#34543 closed
Apr 22, 2025 -
It should be an error to run a Pipeline without ever specifying options
#18699 closed
Apr 21, 2025 -
[Bug]: Python Unit Tests are flaky on windows
#34221 closed
Apr 20, 2025 -
[Bug]:
#34674 closed
Apr 19, 2025 -
Google cloud Console[Bug]: report
#34673 closed
Apr 19, 2025 -
[Task]: Add Validates Runner Yaml Suite (or add to existing suites)
#34648 closed
Apr 18, 2025 -
The PostCommit Python ValidatesRunner Samza job is flaky
#30657 closed
Apr 18, 2025 -
The PostCommit Python ValidatesRunner Spark job is flaky
#30645 closed
Apr 18, 2025 -
Performance Regression or Improvement: test_cloudml_benchmark_cirteo_no_shuffle_10GB-runtime_sec:runtime_sec
#34661 closed
Apr 18, 2025 -
The PostCommit Python Arm job is flaky
#30760 closed
Apr 18, 2025 -
The Python ValidatesContainer Dataflow ARM job is flaky
#33065 closed
Apr 18, 2025
13 Issues opened by 8 people
-
[Failing Test]: TestGCSIORetry.test_retry_on_throttling failed due to unknown reasons
#34736 opened
Apr 24, 2025 -
[Bug]: [Python SDK] Beam.FlatMap doesn't correctly parse TypeVar typehints
#34732 opened
Apr 24, 2025 -
[Bug]: Pipeline Options changed after pipeline execution
#34722 opened
Apr 24, 2025 -
[Task]: Remove plumbing of cache to state backed iterable.
#34717 opened
Apr 23, 2025 -
[Feature Request]: Generate Beam YAML managed I/O config parameters from the schema
#34714 opened
Apr 23, 2025 -
[Feature Request]: Reduce FnApi error logs for processing that is cancelled for expected reasons
#34705 opened
Apr 22, 2025 -
[Feature Request]: Improve failure message when passing Pipeline object / PBegin objects
#34703 opened
Apr 22, 2025 -
[Bug]: Beam YAML documentation not reflecting full parameters.
#34694 opened
Apr 21, 2025 -
[Bug]: Registering DoFns and CombineFns Seems Excessively Slow
#34693 opened
Apr 21, 2025 -
[Bug]: Improve the validation error when both chain and input are given for Beam YAML
#34690 opened
Apr 21, 2025 -
[Task]: Improve the schema support for Beam 3.0
#34672 opened
Apr 18, 2025 -
[Task]: Scope the work about improving Beam Python Typehints
#34671 opened
Apr 18, 2025
63 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Add reshuffle before triggering load jobs.
#34657 commented on
Apr 21, 2025 • 8 new comments -
feat:large-row-skip-in-bigtable | added experimental options to skip …
#34245 commented on
Apr 24, 2025 • 8 new comments -
Spark Runner : Support for Streaming side-inputs for Spark Runner
#34560 commented on
Apr 24, 2025 • 4 new comments -
Python PTransform wrapper for AWS SQS
#34581 commented on
Apr 22, 2025 • 3 new comments -
Fail Fast if Resources Do Not Exist in Kafka Cluster.
#34659 commented on
Apr 24, 2025 • 2 new comments -
[Java] Ensure Pipeline Execution Requires Configuration Options or Logs Warning
#34220 commented on
Apr 21, 2025 • 2 new comments -
[KafkaIO] Improve caching in backlog estimation and processing
#34331 commented on
Apr 24, 2025 • 1 new comment -
[ManagedIO] Fail expansion when encountering extra or unknown configuration
#34525 commented on
Apr 23, 2025 • 1 new comment -
Add Triton Inference Server Support
#34252 commented on
Apr 18, 2025 • 0 new comments -
[Python] Fix WriteToBigQuery transform using CopyJob does not work with WRITE_TRUNCATE write disposition (#34247)
#34248 commented on
Apr 23, 2025 • 0 new comments -
[Java] Add parsedData to Hl7v2Message and Update HL7v2IO Docs
#34213 commented on
Apr 21, 2025 • 0 new comments -
Add support for top-level table properties table creation
#34205 commented on
Apr 21, 2025 • 0 new comments -
[KafkaIO] Remove duplicate offset in range check
#34201 commented on
Apr 21, 2025 • 0 new comments -
[DO NOT MERGE] Removed < 1.66 for grpcio
#34196 commented on
Apr 24, 2025 • 0 new comments -
Fix ProtoCoder NoSuchMethodException
#34194 commented on
Apr 22, 2025 • 0 new comments -
Rethrowing Exception from CassandraIO's ReadFn
#34191 commented on
Apr 21, 2025 • 0 new comments -
Bump @octokit/plugin-paginate-rest and @octokit/rest in /scripts/ci/issue-report
#34167 commented on
Apr 22, 2025 • 0 new comments -
Fallback Implementation
#34148 commented on
Apr 24, 2025 • 0 new comments -
[Bug]: FileIO.matchAll() injects a Reshuffle step which in some case is not useful and might break desirable fusion with more CPU intensive steps
#33330 commented on
Apr 18, 2025 • 0 new comments -
Key by paneindex and reshuffle before loading files.
#34324 commented on
Apr 19, 2025 • 0 new comments -
add graceful restart mechanism for GetWorkStream to prevent DEADLINE_…
#34367 commented on
Apr 23, 2025 • 0 new comments -
Bump @octokit/plugin-paginate-rest, @actions/github and @octokit/rest in /scripts/ci/pr-bot
#34377 commented on
Apr 18, 2025 • 0 new comments -
Add ability to specify partition bounds via jdbc xlang.
#34394 commented on
Apr 23, 2025 • 0 new comments -
sdks/python: enrich data with CloudSQL
#34398 commented on
Apr 23, 2025 • 0 new comments -
Concat protos in BQStorageWriteAPI - solve edge cases during mering of nested repeated fields
#34436 commented on
Apr 21, 2025 • 0 new comments -
Enabling long-running jobs to use federated STS assume role authentication for AWS resources.
#34440 commented on
Apr 18, 2025 • 0 new comments -
[AnomalyDetection] Add a notebook for anomaly detection with Z-Score
#34459 commented on
Apr 21, 2025 • 0 new comments -
Setting num_bytes to DEFAULT_READ_BUFFER_SIZE if it is None originally
#34475 commented on
Apr 25, 2025 • 0 new comments -
SnowflakeIO: filter on db and schema when searching for existing table
#34486 commented on
Apr 23, 2025 • 0 new comments -
✨ Upgrade sidepanel extension to JupyterLab 4.x compatibility [DO NOT MERGE]
#34495 commented on
Apr 22, 2025 • 0 new comments -
Disable logical type cast of fastavro
#34603 commented on
Apr 23, 2025 • 0 new comments -
Add changes note about TFRecord support in beam yaml and rename integration test.
#34635 commented on
Apr 25, 2025 • 0 new comments -
[Bug]: Flink/Samza/Spark Runner could not handle groupby and flatten sharing the same input
#34647 commented on
Apr 18, 2025 • 0 new comments -
[Bug]: Support error handling in PyTransform
#32332 commented on
Apr 18, 2025 • 0 new comments -
[Task]: Enrich ML features such as RAG, Anomaly Detection
#34553 commented on
Apr 18, 2025 • 0 new comments -
[Feature Request]: VertexAIImageEmbeddings adding the ability to send in contextual text too
#34236 commented on
Apr 20, 2025 • 0 new comments -
[Bug]: `beam.io.WriteToCsv` ignores `num_shards` argument.
#34188 commented on
Apr 20, 2025 • 0 new comments -
[Bug]: The submission_environment_dependencies.txt file does not get staged when running with Flink runner on Dataproc
#32743 commented on
Apr 20, 2025 • 0 new comments -
[Failing Test]: AttributeError for MaybeReshuffle in Python SDK
#33854 commented on
Apr 21, 2025 • 0 new comments -
[Task]: Consider reducing dependencies in core
#34554 commented on
Apr 21, 2025 • 0 new comments -
[Feature Request]: Implement JAR caching in the Beam YAML Provider
#33972 commented on
Apr 21, 2025 • 0 new comments -
[Feature Request]: Support for BigTable from YAML
#33902 commented on
Apr 21, 2025 • 0 new comments -
[Bug][Prism]: Prism gets stuck when trying to flatten 2 unbounded pcollections
#33815 commented on
Apr 21, 2025 • 0 new comments -
[Bug]: Docker worker containers are confusingly named.
#34350 commented on
Apr 23, 2025 • 0 new comments -
The PostCommit Java ValidatesRunner Flink Java8 job is flaky
#32949 commented on
Apr 24, 2025 • 0 new comments -
[Bug]: CombineGlobally (and possibly other combines) fails at pipeline translation time when given a side input
#30580 commented on
Apr 24, 2025 • 0 new comments -
[Bug]: KafkaIO unbounded read requires kafka connection on pipeline submission time
#34630 commented on
Apr 24, 2025 • 0 new comments -
The IcebergIO Integration Tests job is flaky
#31931 commented on
Apr 24, 2025 • 0 new comments -
Globbing asterisk cannot be escaped in file reads.
#21140 commented on
Apr 24, 2025 • 0 new comments -
[Bug]: gprcio limitation to < 1.66 in Python is problematic
#34081 commented on
Apr 24, 2025 • 0 new comments -
The PreCommit Python Coverage job is flaky
#30813 commented on
Apr 24, 2025 • 0 new comments -
[Bug]: Dataflow option validation fails in VSCode Jupyter notebook
#34101 commented on
Apr 25, 2025 • 0 new comments -
Support writing to Pubsub with ordering key; Add PubsubMessage SchemaCoder
#31608 commented on
Apr 22, 2025 • 0 new comments -
BigQueryIO uniformize direct and export reads
#32360 commented on
Apr 21, 2025 • 0 new comments -
add generics support to AutoValueUtils helpers
#32977 commented on
Apr 21, 2025 • 0 new comments -
Adding Google Storage Requester pays feature to Golang SDK.
#33236 commented on
Apr 21, 2025 • 0 new comments -
Bump com.diffplug.spotless:spotless-plugin-gradle from 5.6.1 to 7.0.2
#33599 commented on
Apr 23, 2025 • 0 new comments -
Bump com.diffplug.spotless from 5.6.1 to 7.0.2
#33600 commented on
Apr 23, 2025 • 0 new comments -
Bump @octokit/request-error, @actions/github and @octokit/rest in /scripts/ci/pr-bot
#33998 commented on
Apr 23, 2025 • 0 new comments -
[DO NOT MERGE] Experimental Iceberg sharding runs
#34020 commented on
Apr 21, 2025 • 0 new comments -
Switch to use registerFileSystemsOnce for SerializablePipelineOptions constructor
#34028 commented on
Apr 21, 2025 • 0 new comments -
add vendor to manually shutdown and restart GetWorkerMetadataStream to prevent DEADLINE_EXCEEDED errors
#34053 commented on
Apr 21, 2025 • 0 new comments -
Fix Docker build error by adding fallback for python3.12-distutils
#34144 commented on
Apr 21, 2025 • 0 new comments