[Bug]: Beam YAML provider docs show unsupported provider configuration #34646

Closed
@ryanmadden-google

Description

What happened?

The Beam YAML provider docs show the following example provider configuration:

- type: yaml
  transforms:
    # Define the first transform of type "RaiseElementToPower"
    RaiseElementToPower:
      config_schema:
        properties:
          n: {type: integer}
      body:
        type: MapToFields
        config:
          language: python
          append: true
          fields:
            power: "element ** {{n}}"

    # Define a second transform that produces consecutive integers.
    Range:
      config_schema:
        properties:
          end: {type: integer}
      # Setting this parameter lets this transform type be used as a source.
      requires_inputs: false
      body: |
        type: Create
        config:
          elements:
            {% for ix in range(end) %}
            - {{ix}}
            {% endfor %}

The docs then show the provided transforms being used as follows:

transforms:
  - type: Range
    config:
      end: 10
  - type: RaiseElementToPower
    input: Range
    config:
      n: 3
  ...

However, providing and using RaiseElementToPower in this manner results in an error. For example, if the provider stanza is in provider.yaml and the following pipeline is run:

pipeline:
  type: chain
  transforms:
    - type: Range
      config:
        end: 4
    - type: RaiseElementToPower
      config:
        n: 2
    - type: LogForTesting
providers:
  - include: provider.yaml

Then the following error occurs: ValueError: Invalid transform specification at "RaiseElementToPower" at line 7: Missing inputs for transform at "MapToFields" at line 1

This error also occurs when the pipeline does not use type: chain and instead specifies each transform's inputs.
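For reference, the non-chain variant described above might look like the following sketch. It is a reconstruction, not the reporter's actual file: the explicit `input:` entries are assumed to follow the usage shown in the docs example, and `provider.yaml` is assumed to hold the provider stanza quoted at the top of this report.

```yaml
# Hypothetical reconstruction of the non-chain pipeline; the original file is not shown in this report.
pipeline:
  transforms:
    - type: Range
      config:
        end: 4
    - type: RaiseElementToPower
      # Input wired explicitly instead of relying on type: chain.
      input: Range
      config:
        n: 2
    - type: LogForTesting
      input: RaiseElementToPower
providers:
  - include: provider.yaml
```

According to the report, this form fails with the same "Missing inputs" error, which points at the provider body rather than at how the pipeline wires its inputs.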

The format of the provider body definition seems to be the source of the issue: RaiseElementToPower defines its body as a single inline transform mapping. Related tests in the Beam source do not exercise this style of provider definition; they cover only the block string literal style used by Range above and the 'chain' style. For example:

...
          body:
            type: chain
            transforms:
              - type: MapToFields
                config:
                  language: python
                  append: true
                  fields:
                    power: "element**{{n}}"
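Applied to the docs example, the 'chain' style would give a provider stanza along these lines. This is a minimal sketch that combines the docs' RaiseElementToPower definition with the chain-style body from the tests. It assumes that the style exercised by the tests behaves the same way in a standalone provider.yaml, which has not been verified here.

```yaml
# Sketch only: docs example rewritten with a chain-style body (untested assumption).
- type: yaml
  transforms:
    RaiseElementToPower:
      config_schema:
        properties:
          n: {type: integer}
      body:
        # Wrapping the single transform in a chain, as the existing tests do.
        type: chain
        transforms:
          - type: MapToFields
            config:
              language: python
              append: true
              fields:
                power: "element ** {{n}}"
```

If this form runs cleanly against the pipeline above, the docs could either adopt it or the provider code could be changed to accept the single-transform body shown in the current docs.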

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
