[Bug]: Beam YAML provider docs show unsupported provider configuration #34646

Closed
1 of 17 tasks
ryanmadden-google opened this issue Apr 16, 2025 · 3 comments · Fixed by #34701

@ryanmadden-google

What happened?

The Beam YAML provider docs show the following example provider configuration:

- type: yaml
  transforms:
    # Define the first transform of type "RaiseElementToPower"
    RaiseElementToPower:
      config_schema:
        properties:
          n: {type: integer}
      body:
        type: MapToFields
        config:
          language: python
          append: true
          fields:
            power: "element ** {{n}}"

    # Define a second transform that produces consecutive integers.
    Range:
      config_schema:
        properties:
          end: {type: integer}
      # Setting this parameter lets this transform type be used as a source.
      requires_inputs: false
      body: |
        type: Create
        config:
          elements:
            {% for ix in range(end) %}
            - {{ix}}
            {% endfor %}

and indicate the following use of the provided transforms:

transforms:
  - type: Range
    config:
      end: 10
  - type: RaiseElementToPower
    input: Range
    config:
      n: 3
  ...

However, providing and using RaiseElementToPower in this manner results in an error. For example, if the provider stanza is in provider.yaml and the following pipeline is run:

pipeline:
  type: chain
  transforms:
    - type: Range
      config:
        end: 4
    - type: RaiseElementToPower
      config:
        n: 2
    - type: LogForTesting
providers:
  - include: provider.yaml

Then the following error occurs: ValueError: Invalid transform specification at "RaiseElementToPower" at line 7: Missing inputs for transform at "MapToFields" at line 1

This error also occurs when the pipeline does not use type: chain and instead specifies each transform's inputs.
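
As a sketch of that second variant, assembled from the usage snippet and the chain pipeline above (the exact file that was run is not shown in this report, so the layout here is an assumption), the pipeline with explicit inputs would look roughly like this:

```yaml
# Assumed non-chain variant: each transform names its input explicitly.
pipeline:
  transforms:
    - type: Range
      config:
        end: 4
    - type: RaiseElementToPower
      input: Range
      config:
        n: 2
    - type: LogForTesting
      input: RaiseElementToPower
providers:
  - include: provider.yaml
```

Per the report, this form fails with the same "Missing inputs" error as the chain form.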

The format of the provider body definition seems to be the source of the issue: in RaiseElementToPower the body is a single transform written directly as a YAML mapping. Related tests in the source do not exercise this style of provider definition; they cover only the block string literal style used in Range above and the 'chain' style. For example:

...
          body:
            type: chain
            transforms:
              - type: MapToFields
                config:
                  language: python
                  append: true
                  fields:
                    power: "element**{{n}}"
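
Combining that tested 'chain' body with the RaiseElementToPower definition from the docs gives the following provider stanza. This is only a sketch of a possible workaround: it is pieced together from the fragments in this report, it has not been run here, and it assumes the chain style behaves as the existing tests suggest.

```yaml
# provider.yaml (sketch): same transform as in the docs, but with the
# body wrapped in a single-step chain instead of a bare transform mapping.
- type: yaml
  transforms:
    RaiseElementToPower:
      config_schema:
        properties:
          n: {type: integer}
      body:
        type: chain
        transforms:
          - type: MapToFields
            config:
              language: python
              append: true
              fields:
                power: "element ** {{n}}"
```

The Range definition can stay as written in the docs, since it already uses the block string literal style.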

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@TanuSharma2511
Contributor

.take-issue

@chamikaramj
Contributor

Hi @TanuSharma2511, could you confirm whether you are actively working on this? This seems to be blocking another issue.

It's great if you are working on it, just wanted to confirm.

@liferoad liferoad added this to the 2.65.0 Release milestone Apr 21, 2025
@robertwb robertwb self-assigned this Apr 21, 2025
@robertwb
Contributor

I think I've figured out what the issue is. I'll post a PR shortly.

robertwb added a commit to robertwb/incubator-beam that referenced this issue Apr 21, 2025
Previously the fact that a transform had inputs was not propagated
to the recursive transform creation, resulting in a validation error.

This fixes apache#34646.
robertwb added a commit to robertwb/incubator-beam that referenced this issue Apr 21, 2025